Investigating DAX
  • Home
  • About me
  • Articles
  • Contact

A bit of advertising ;)

Posted Jan.20, 2012 by denisfed, under Non Technical

As you,probably, already know, Nigel Frank International now is conducting Dynamics Salary Survey

Of course, primary goal for these guys (or these ladies – looks more like it) is to collect HR-database with contact of potential hires, as they always asking you to fill out your contact data in the end of survey. Yet the results of of the survey, which they promise to send everyone, who submit this survey can be very interesting for everyone in Dynamics AX community.  I have filled out this questionnaire already, because I am interested in three things:

  1. Quite naturally, I want to know which country has highest salary for a Dynamics AX specialist.  ;) Especially, after this salary is normalized by,say, Expatistan cost of living index
  2. I want to figure out the proportion between cost of software (Dynamics AX in our case) and cost of implementation. Say, 11 years ago, when I started my career as an Axapta specialist, typical DAX project budget in Russia was around $100-120K for software and about $80-90K for for consulting. Now, after 10 years of high oil prices and consequential dollar wages increase, average salary in Dynamics AX market in Russia grew four-folds. Now, even with considerable increase in Dynamics AX price per user seat, cost of consulting typically is about 200% of software purchase costs. I am wondering, which country has highest, say, highest annual-wage to user-license-price ratio and which country – lowest. Actually, it looks a bit peculiar for me, that Microsoft use almost the same sales price for all countries regardless of their GDP per capita and average salary. I guess, if Microsoft pricing policy would be more flexible, they could greatly increase their user base in less-well-to-do countries. It might pay pretty well in future, but it requires more strategic kind of thinking then typical Microsoft executive have…
  3. I am also want to compare an average Dynamics CRM salary year-over-year. I think, we are witnessing a kind of commoditization of Dynamics CRM market. In most cases, Dynamics CRM is used simply as development environment to make ANY kind of custom development. No significant knowledge in consulting in general and sales management principles in particular is required to enter this market. Most of Dynamics CRM specialists (at least, people I had chance to meet and work with) are techies, who studied Dynamics CRM in a row of other Microsoft technologies (somewhere after ASP.NET and MS SQL, but before Biztalk). So, I expect drop in average Dynamics CRM salaries in future. I recently red that Dynamics CRM revenue for Microsoft has increased by 30% in last quarter. It might be a good indication of forthcoming salary drop. Demand for CRM specialists will increase sharply and then drop sharply as well. Sooner or later, knowledge of Dynamics CRM can become kind of commodity knowledge, like knowledge of C# or MS SQL. It can still bring money, but  this money won’t be as good as 2-3 years ago…

So, I hope that this survey will be representative enough to provide every participant with relevant data. That’s why it is so important for everyone to fill it out…

Regards,

Denis

 

Leave a Comment

Multithreaded MRP and related issues

Posted Nov.07, 2011 by denisfed, under BOM, BOM Level, Infrastructure, MRP, Performance

Today, I will share some experience related to setting up and fine tuning MRP for best performance.

The most significant MRP-related improvement of DAX2009 over older versions is an ability to execute MRP in multi-threaded mode, when planning process is spread over several independent processes executed in parallel. As usual, multi-threaded execution works only when MRP is executed in batch mode, since helper threads are spawned as additional tasks inside the same batch job. This is the current way to support multi-threaded execution in Axapta.

Basic setup

The number of helper threads is specified on the second tab (Scheduling Helpers) of MRP start menu parameters. What number of helpers should be used ? To my experience, MRP scales very well. On my current project, we use 15 helpers (and 1 main thread). We tried to increase number of helpers to 31, but have not found any significant performance difference. Generally, it depends on configuration of your batch server(s) and your database server.When you are trying to increase the number of helper threads, you should check resource utilization for both batch server and database server. If one of these servers becomes saturated after yet another increase in helper threads number, then it is time to stop or, maybe, time to upgrade your hardware. I also want to mention, that I have positive experience of allocating MRP onto several batch servers. I would say, that my personal recommendation is as the following: First check every batch server, which will participate in MRP for the number of CPU cores it have. Then configure every batch server to run 2xNumber_Of_Cores batch threads. (In Administration->Setup->Server Configuration). Then specify number of helper as total number of batch threads for  all batch servers serving your MRP batch group MINUS 1.  Remember, that main batch thread also consumes one thread from your batch server group capacity. Also, if during MRP run, you are going to run some other batch processes on the same batch server(s), you may want to decrease the number even further, so maybe you should set number of helper-threads to total thread capacity minus 2 or 3.

Next important parameter to discuss is Item Distribution factor. What does it mean? On the very first stages of MRP (In Update
phase to be precise), the system allocates all items (all items in inventTable or all items in inventTable, which fit to the query specified in MRP startup dialog box) into chunks. Every chunk is a kind of unit of work, which is processed by one thread. During the phases of MRP, which are executed in parallel (Data regeneration, Coverage planning, Futures planning and Action Messages), every thread grabs chunk, process it, then grab another chunk, process it and so on, until chunks for given stage and BOM level are marked as processed.

Size of the chunk is calculated as Item_Distribution_Factor*(Number_Of_Helpers+1).

From one side, smaller chunks ensure smooth processing and even allocation of items between threads. The smaller chunk is, the less are chances for a thread to grab a chunk with high number of computational intensive items. Say, one item takes from 1 to 10 seconds to be coverage planed. If size of the chunk is only 1 item, then in the worst case, we most late thread will finish BOM level coverage only 10 seconds after the most early thread  finished. If chunk size is 300 items, then in the worst case scenario, difference between time to process is 300*10Seconds-300*1Second==2700Seconds==45Minutes. It means that for this worst case scenario, there are good chances that most of  helper threads will be doing nothing, waiting for 30-45 minutes for the last, unlucky thread to finish processing. It would increase planning time and it also would lead to non-optimal usage of hardware. (Since most of the threads would do nothing, while waiting for end of BOM Level). From other side, allocation of the chunk to a thread is a competitive process, which leads to temporary database locks. Several threads often try to allocate the same chunks in parallel; Only one of these threads succeed, while others repeat allocation process until they grab their own chunk. So, chunk allocation can become a bottleneck itself if number of chunks is too high and size of the chunk is too small. To my experience, reasonable size of the chunk is somewhere between 10 and 60. To find out the optimum chunk size and distribution factor, you can simply make several test with different distribution factors. Also, you can check ‘Track Item Process Duration’ checkbox in MRP Dialog parameters and then check typical item planning time in Master Planning->Inquiries->Unfinished Scheduling Processes->Inquiries->Item Process Duration. If item process duration varies a lot, you can benefit from smaller chunk size; if it does not, then probably increase of chunk size can be more beneficial.

Another potential way to improve performance (I never tried it though) is to put randomly-ordered items into a chunk. Now, when the system creates chunks, it simply iterates over inventTable ordered by itemId, so a chunk contain items with similar itemId. Since, usually, items with the same complexity of planning often has sequential itemIds, it often leads to uneven of distribution of items between chunks. Some chunks consist from items which are regular purchased items, while other chunks consist of complex BOM-items, which require complex resource planning on many work center groups. If you add special MRPOrder field to inventtable, fill it with random number during item creation and then sort by this field during chunk generation, you can have more even distribution of items between chunks.

Infant mortality issue

The frequent problem of parallel MRP run is early termination of helper threads. Say, we started our MPR as usual, but then in 15 minutes, we see that all our helper batch tasks terminated withthe strange message “Nothing to process”. (You can see this message, if you click Log button in Batch Task form before main thread of MRP-batch terminate). Then the only remaining main thread continues to run MRP in single-thread mode (very slowly).
Here is what is happening:

  1. Main thread spawn helper threads and continues it work
  2. Helper threads continuously check status of process waiting for it to advance to “Deleting and inserting data”
  3. Main thread deletes net requirements and planned orders for plan being regenerated. (and it takes a lot of time usually)
  4. Helper threads, after waiting for 15 minutes for status update, terminates with “Nothing to process” message
  5. Main thread runs MRP in single thread mode.

First of all, the standard logic of MRP has obvious bug in helper thread behavior. While waiting for status update from main thread, the system keeps reading process status table (reqProcessList) continuously,without any pauses between executing  SELECT statements. It is easy to figure out, that, say, 16 threads, issuing 70-100 statements in a second will send to DB server about 1500 statements in a second. This amount of queries can easily consume  an average DB Server’s capacity, so it simply won’t have a time for useful work (like deletion of net requirements in main thread). To prevent the system from this behavior, you need to modify method run() of ReqProcessExternalThread class. It contains the ill-fated status waiting loop. You should find the end of while(true){} loop and
insert into it sleep(500); statement. This will add 0.5 seconds delay between re-reads of the process status, thus giving to your database server a time to breathe.

Another issue to fix is low timeout value for helper threads. If you have realistically large working set of net requirements, deletion of the plan would take much more time then 15 minutes, expected by developers of the functionality. To increase timeout, you should open class declaration part of the mentioned class (reqProcessExternalThread) and modify definition of WAITFORPROCESSSTATUSUPDATE macros. Try to change the value of macros from 15 to at least 60 or maybe even 90.

Slow end-of-level processing.

Often, you can see in the Unfinished Scheduling Process form, that during coverage phase, in the end of BOM Level he system works very slowly. It looks like it almost hanged for 30-40 minutes, but then, suddenly it advances to the new level. If you start AOS tracing on the batch server, you will see that all helper threads are doing nothing (they simply rereading process status waiting for advance to the next BOM level), while one unlucky thread is busy calculating coverage for portion of items. Also, in Unfinished Scheduling Process form, you can realize that the number of items on the level decreases ! 10 minutes ago you
had more then 15000 items on the level, now it is slightly less then 15000, and in a few minutes it will be less then 14500.

What’s happening ?

The main reason for the issue is incorrect BOM Level data in inventTable. When the system allocates items between chunks, every
chunk has attached BOM Level. But sometimes, during explosion of BOM for,say, 5th level, the system finds as a component of
exploded BOM the item, BOM Level of which (according to inventTable) is 2. Coverage for this item was already created
already. How does the system calculates coverage info for this item again ?

  1. The line for the item in ReqProcessItemListLine (It is a list of lines in a chunk) is marked as ‘IsLevelChanged’
  2. The system finds of create special ‘Spare’ chunk for items, which were pushed to the next level because of sudden level
    change. This chunk’s line in reqProcessItemList is marked as ’Spare’.
  3. The system reassign the line for the item in ReqProcessItemListLine to Spare Chunk
  4. During processing of next BOM Level, the system checks the item for being ‘pushed’ from level from level. If level for item was changed, the system deletes planned orders for item, previously created during planning on incorrect BOM Level.

The main performance issue of this approach is that the systemdoes not control the size of ‘Spare’ chunk at all. I witnessed
cases, when size of this chunk exceeded 1000 items. If planning of every item takes,say,4 seconds, then unlucky thread, which managed
to grab this chunk will spend more then 1 hour for processing of it. Since this chunk has highest number for a BOM level, it is usually
grabbed by some unlucky thread in the very end of BOM Level processing. Then one thread is processing this chunk for an hour, while all other threads are doing nothing, simply waiting for our unlucky thread.

The very first idea of how to fix the issue which came to the mind is to calculate BOM levels before every full regeneration planning. Unfortunately – it does not always help. Sometimes, we have incorrect BOMLevel info in inventTable only because we changed out BOM structure without recalculating BOM levels. But often, the reason for level change is that real life nesting of items in production BOMs does not fit to theoretical BOM structure described in master-data BOM. Say, we have a shining brass-head bolt, which is used only to screw a label with the name of our company to the final assembled good. Naturally, after calculation, it will get BOM Level of 1. Say, then, during production of a deeply nested sub-BOM of our finish good, someone decided to use the same bolt for screwing much smaller label to one of the sub-components. He dropped the line with standard bolt from production order’s BOM and added new line with the shining brass-head one.(Maybe it is not very good practice, but you simply can not update standard BOM structure for every possible small change, requested by a customer). If you recalculate BOM Levels, this bolt will still have BOMLevel 1. But if you try to run MRP, you would find out that it coverage actually is performed in level 5 or 6. By the way, one of the very last stages of MRP is BOMLevel update. During this stage, the system updates InventTable with actual BOM Level data, gathered during MRP processing, not with theoretical BOM Level from master-data BOM Structures.

So, the only way to resolve the issue, is to change behavior of the system to create many smaller Spare Chunks for a level, instead of one large Spare Chunk. To accomplish this, you need to modify method getSpareListNum() of ReqProcessItemList table:

static server ReqProcessListNum getSpareListNum(ReqProcessId        _processId,
                                                BOMLevel            _level,
                                                ReqProcessStatus    _status,
                                                Connection          _con
                                                )
{
    ReqProcessItemList          reqProcessItemList;

    ReqProcessItemListLine       reqProcessItemListLine;
    RandomGenerate              randomGenerate=new RandomGenerate();

    ;

    select firstonly ListNum, RecId from reqProcessItemList
    order by listNum desc
    where
        reqProcessItemList.ProcessId == _processId &&
        reqProcessItemList.Level     == _level     &&
        reqProcessItemList.Spare     == true;
    if (reqProcessItemList)
    {
        select count(recid)
        from reqProcessItemListLine
        where reqProcessItemListLine.ProcessId==_processId &&
              reqProcessItemListLine.ListNum==reqProcessItemList.ListNum;
    }
    if (!reqProcessItemList || (reqProcessItemListLine.RecId>=#MAXLISTSIZE)) //#MAXLISTSIZE is the macros with size of the spare chunk. Should be around 30-60
    {
        randomGenerate.parmSeed(timenow());
        try
        {
            reqProcessItemList=null;
            reqProcessItemList.setConnection(_con);

            select maxof(ListNum) from reqProcessItemList where reqProcessItemList.ProcessId == _processId;

            reqProcessItemList.ListNum=reqProcessItemList.ListNum+randomGenerate.randomInt(1,20);
            reqProcessItemList.ProcessId = _processId;
            reqProcessItemList.Level     = _level;
            reqProcessItemList.Status    = _status;
            reqProcessItemList.Spare     = true;

            reqProcessItemList.insert();
        }
        catch(Exception::DuplicateKeyException)
        {
            retry;
        }

    }

    return reqProcessItemList.ListNum;
}

The only strange thing in this piece of code is a usage of random number generator. I use it to decrease the number of potential conflicts, when several threads is trying to allocate new Spare Chunk, leading to duplicate key exception.

Leave a Comment

Parallel BOM calculation in RollUp 7

Posted Apr.29, 2011 by denisfed, under BOM, Infrastructure

Since Microsoft released most recent update of Dynamics AX (I mean SP1 rollup 7), I would like to point the readers of my blog to very important update in this version, which can dramatically increase performance of one of the critical, but slow operations from production module of Axapta.

I am talking about BOM Calculation process, which in previous version can only be started in single-thread mode. I heard about implementations, where BOM recalculation for all BOMs in the system were taking more then 2 days.

Somewhere in the last December, Microsoft released hotfix KB2487033, which supported multi-threaded execution of BOM calculation and now this feature included into RollUp7.

This nice feature works only if BOM Calculation is started not for a given ItemId, but for  all BOMs in the system (say – from costing version form) and if user specified batch-mode calculation of this batch job. In this case, the system behaves the following way:

  1. BOM levels are recalculated
  2. Tables bomCalcItemTask and bomCalcItemInventoryDimensionTask  are populated with item ids and item inventory dimensions to calculate. (These two steps are also executed in standalone, single-threaded mode of calculation).
  3. In single-threaded mode, the system starts to iterate over the items to calculate, starting from the lowest level components (with numerically highest BOM Level).
  4. In multi-threaded mode, system SPAWNS a certain number of helper tasks for EVERY BOM level to be calculated. It also set dependencies between tasks of every BOM Level, so jobs of BOM Level 7 starts only after tasks of BOM Levels 8 and 9 are completed, tasks of BOM Level 5 depends on tasks of BOM levels 9,8,7,6 and so on…
  5. Then in multi-threaded mode, system just calculates every BOM level in parallel. It is quite meditative experience - to look onto progress of tasks in you BOM Calculation Batch job. :)  It also saves you a lot of time. In my case, calculation time dropped from one night to something like 20-30 minutes, moreover 10 minutes of it, were taken for BOM Level recalculation (I guess, it is principally not parallelizable task by it nature).

Nevertheless, there are a few issues, which I would like to share here:

  1. The BOM Calculation process is not robust. If one of the BOMs has an error (quite typical if data were imported from heritage system), then on the first BOM with error, one of the BOM recalculation tasks will terminate. Then next task will try to recalculate the same BOM and terminate again, then next, and so on. As a result, usually, after first calculation you will find out that only 20-30 percent of you items has new prices. Solution for the issue is simple: You should simply put try {} except block somewhere inside  BOMCalcJob_All.runBOMCalculation() method, so error in BOM would not terminate the whole job.
  2. When system calculates the number of helper threads, which will be used for BOM Calculation, it is implicitly assumes, that all available batch servers will be used for the calculation. Say, if you have 3 batch servers and every server has 8 maximum batch threads specified in batch server configuration, the system will spawn 24 helper threads for every BOM level. First of all, the number itself is too big, normally 8-12 threads should be enough. Morever, if you going to use only one batch server (out of three) for given BOM Calculation batch group, these 24 helper threads will be executed in three steps (8 threads each). So, I recommend to modify BomCalcJobItemTaskBatchManager.numberOfBatchSessions() method to return more reasonable number of threads if you have more then 1 batch server or if your batch server has too many batch threads available.

Also, as kind of shameless self-promotion :) , I want to mention that for one of my project I made a clone of the feature, but the clone, which calculates in parallel only ONE BOM with all dependent sum-BOMs and items.

 

 

1 Comment

Ledger Balance Data in DAX

Posted Sep.26, 2010 by denisfed, under General Ledger

 Finances, General Ledger, History of Axapta, History of Dynamics AX, LedgerBalances, Locking

I remembering the first time when our team was developing a solution which was supposed to quickly calculate the GL balance for specific GL Account/Dimension combination. It was 2001 and we were developing the cost allocation functionality.

We just started to work with Axapta 2.1, so – our first approach was simply to calculate a GL Balance by summing up LedgerTrans table from the beginning of times . Soon, after we went live, we realized that the brute force approach is not always the most successful way of doing things in Axapta.

Luckily, we found the ledgerBalance and LedgerBalanceDim tables. First table contained the summed up GL turnover per GL Account+Period; Second table contained the summed up GL turnover per GL Account+GL Dimensions+Period. After we started to use these two tables for calculation GL balance from these two tables, we saw dramatic improvement in performance, since 200-300 of ledgerTrans records was packed into one record of these balance tables. (We had around 10-15 postings per GL Account daily and our GL period was a month).

Little bit later (like in the Fall 2001), we found out that these two tables is a more like part of problem than part of solution. :) Every time when you design some table which holds summed-up data for quick retrieval, one should remember that update of the data in the table will hold the lock on the table’s record until end of the transaction. So, If Axapta was implemented in the company, which did not have very detailed GL account sheet  (say only 3-4 GL accounts was used to hold the customer’s payment balance), posting of the multi-line payment journal could prevent everyone else from posting ANY other customer-related document for 5-10 minutes. Somewhere in the beginning of transaction, system was updating the balance data and then this data was locked until the end of transaction. In connection with mutual locking during inventory update, which I discussed in my article “History of Inventory Locking”, this had a rather catastrophic consequences for the system’s stability and performance. Couple of carefully planted orders/journals with the same GL accounts and inventory items, could make system unusable for a whole business day :) . The only remedy for this was the usage of batch servers for posting of nearly all documents, because batch server was naturally preventing any competition for resources by converting several potentially parallel processes into one sequential queue of processes. This approach could be a solution for accountancy-centric implementations, but any real time process, like WMS  just could not be fixed this way. That’s why I never believed into any of success stories telling about WMS implementation in Axapta made BEFORE release of Axapta 3.0.

The situation was relieved in Axapta 3.0.  These two tables was replaced with LedgerBalanceTrans/ledgerBalanceDimTrans tables. To prevent lock chains and deadlock during an update of balance data, developers of Axapta introduced two additional fields to these new tables: the transaction date and the funny named field ‘LedgerBalancesVariant’. First thing, we realized that now balance data is stored on per-date basis, but not per-period. Initially, the meaning of this Variant field was not clear for me. Soon, I found out that actually, system keeps up to 20 (TWENTY) GL balance records for the same date and same account/dimension combination. During update, system was RANDOMLY selecting a number from 1 to 20 and then updating/inserting balance record with arbitrary Variant number. (Actually – It was not complete random number. It was calculated as Variant= (SessionId mod 20), so update of the same account/dimension combinations from the same session went to the same balance record. If they were using true random() value, update from the same session and the same DB transaction would be scattered across several records even it was the only transaction for the combination in the day.) Thus, chances for long locks was greatly decreased. Even if GL account sheet is too coarse grained, probability for two users to update the same record at the same time now was like only 5%. When I first studied this concept, I had an idea that maybe after 300-400 concurent users, I would increase the number of these ‘variants’ to 30-40, but actually this idea proved to be wrong, since I never saw any serious lock contentions on ledger balance data since release of version 3.0. Of course, if someone were to monitor process locking in MS SQL, he/she would see occasional waiting locks on ledgerBalanceTrans/ledgerBalanceDimTrans from time to time, but generally – balance data had ceased to be the main bottleneck of system’s performance.

From the other side – this solution actually traded update performance for for query performance. After balance data have been scattered across 20 variants and stored on the per-date basis, size of ledger balance table became comparable to the size of the ledger transactions table itself. Typically, size of LedgerBalanceDimTrans table became to be around 50-70% of ledgerTrans table. Since then, usage of ledgerBalanceDimTrans instead of ledgerTrans for reporting purposes became unnecessary complication, since query time for the both cases was comparable, but ledger balance data sometimes lacked necessary details important for some particular reports. Of course, in may cases usage of ledgerBalanceDimTrans was little bit more convenient then direct usage of ledgerTrans data, since one record of the balance table holds both debit and credit turnover, while fetching the corresponding data from ledgerTrans with ‘group by crediting’ would give us debit and credit turnovers in different records.

When I first saw ledger balance tables in Dynamics AX 2009 I was very surprised to see that field Variant had gone. First of all I had not expected any changes in these area since it had became non-problematic since 2003. Second – I saw no way for this mechanism to exists without Variant field.

Actually, it turns out that designers of financial subsystem in Axapta realized that actual results from usage of this Variant algorithm was satisfactory, but not perfect. From one side, we lost almost all of our DB space/query time savings from ledgerBalance* usage, from the other side – we still can have a lock conflicts on ledger balance data updates. (Though less frequent). So the designers decided to store ONE balance record per ONE ledger voucher. In the worst case (if we have post mostly one-line documents) we will have the number of ledgerBalance* records equal to the number of ledgerTrans records. In the realistic case (we post at least 50-70% of multi-line documents) we will have the same number of records as in Variant-mechanism. Since now these balance data is not a query performance increase mechanism anyway since 2003, actual size of ledgerBalances* tables does not matter; It enough for them to not outgrow ledgerTrans itself. From the other side, since EVERY ledger voucher now has it own balance record, there is no chance of lock conflicts at all.

One interesting point which is worth to mention about the new mechanism is the way how these balance updates are being made now:

  1. All updated balances made in one ledgerVoucher is kept in RecordSortedList (thus – in memory) somewhere in LedgerBalancesList/LedgerBalancesPostingList class.
  2. As soon as posting of ledgerVocher has finished, or the list became to large to hold in memory it is being flushed to quasi-temporary table LedgerBalancesTransDelta.
  3. In the end of ledgerVoucher, system opens separate additional database connection and use this separate connection to execute insert_recordset statement which insert new balance records by summing up ledgerBalancesTransDelta. This update use optimistic locking mechanism (with transaction retry), but since this is separate transaction with a very little of actual database update, it can be rolledback and retried with much overhead. (And this is exact reason to use separate transaction for this update! If we were to use the main database transaction, in case of lock conflict, this would cause a great overhead (Imagine 20000 lines FA depreciation journal, which was rolled back and posted again because of balance data lock conflicts).
  4. Finally, the records belonging to current session are being deleted from ledgerBalancesTransDelta. (That is why this table called quasi temporary).

Although, as I said, I have never seen any actual implementations of Dynamics AX 3-4 which really suffered from ledger balance data related bottlenecks, It might be that these problems would arise in 3000-4000 user implementation. (Never have seen personally implementations with more then 700-800 users). Overall, the whole history of changing approach to balance data in DAX looks like a good example of software design and engineering.

1 Comment

Tracing in Dynamics AX 4 and 2009

Posted Mar.10, 2010 by admin, under Infrastructure

 Performance; Trace parser; DAX Kernel;Tracing;

I’ve spent considerable time trying to figure out how tracing works in newer versions of Dynamics AX. Although manual for Trace Parsers seems to be quite satisfactory on topics related to analysis of collected trace, it is pretty terse on the topic of trace collection and setting up; Dynamics AX Performance Team log has two very useful posts on trace parser (http://blogs.msdn.com/axperf/archive/2008/07/18/traceparser-video-training-part-i-installation.aspx and http://blogs.msdn.com/axperf/archive/2008/07/30/traceparser-video-training-part-ii-configuration-collection-and-importing.aspx), but it seems that these trainings cover some early beta of trace parser tool and prerelease version of DAX2009. So, I decided to write small article about my findings related to trace parsing in Dynamics AX.

To have a full picture of tracing subsystem in DAX, keep in mind that actually DAX supports THREE kind of tracing:

  1. SQL Tracing
  2. Server Tracing
  3. Client tracing

SQL Tracing

This kind of tracing was supported from the earliest version of Axapta, I’ve met. Even in Axapta 2.1 it was possible to have a log of SQL Statements issued by your application, by checking appropriate checkboxes in SQL tab of user settings. I won’ describe these settings in detail, You’d better play with them to understand. I only want to mention two SQL tracing-related peculiarities, introduced in version 4.0:

  1. SQL Tracing work ONLY if your AOS configuration has “Allow client tracing on Application Server Instance” setting checked. In versions 2 and 3, SQL trace worked independent from any server configuration settings, so I was  little surprised when I tried to run familiar SQL trace on DAX4.0 without any results. It seems,that in attempt to gain couple of extra percents of performance, DAX kernel’s team decided to make this SQL tracing globally switchable starting from version 4.0.
  2. If you are trying to log SQL Trace into file, expect to find this file ON SERVER SIDE. In versions 2 and 3, SQL trace file was being stored on local log directory. Starting from version 4.0, SQL trace file is being stored on Server side log directory. Probably, this is because performance team decided to completely decouple tracing process from client. Before attempt to commence SQL Tracing to file, please ensure that your AOS user (I mean – AD user, under which account AOS is run) has enough rights to create and write files in log directory. (Normally, It is something like C:\Program Files\Microsoft Dynamics AX\50\Server\DynamicsAx\Log).

Also, Normally results of SQL Tracing is stored in SQL table, browsable from Axapta itself. The only reason for usage of SQL Tracing to log-file is a special request from Microsoft support team (because it is much easier to e-mail a log-file than log-table).

Server tracing

The most useful way of tracing to my experience. First of all, I want to mention that this tracing module was introduced in Dynamics AX 4.0 and it heavily depends on Events Tracing for Windows subsystem (described in details in http://msdn.microsoft.com/en-us/library/aa363787(VS.85).aspx) So, if You have some problems with starting server trace, You should check that your AOS Server has appropriate Trace Provider installed. You can do it by running “logman query providers” command in AOS’ command prompt. Resulting output must have a line starting with “AOS Trace”. If your server does not have appropriate query provider, You can try to fix this by going into server’s BIN directory and running ‘mofcomp aos_trace.mof’ command. If your server has UAC turned on, this command must be run from command prompt with elevated privileges.

Tracing parameters can be configured on the ‘Tracing’ tab of Dynamics AX server configuration utility. I won’t go into detail about every checkbox and number on this tab,coz they are described in reasonable details in trace parser’s manual. The most funny thing about server tracing is that important-looking ‘Start Trace’ and ‘Stop Trace’ buttons actually DO NOT start a trace at all. They only change the value of Tracestart parameter in registry to 1 or to 0 respectively.

Somewhere in the heart of AOS’ server process, there is a separate thread which wakes up every second and check the value of TraceStart parameter in the registry. If the value of this parameter has changed from 0 to 1, this thread reads all other tracing related parameters from the registry (these parameters described in more details on Trace Parser’s manual) and starts the trace. If the value of this parameter has changed from 1to 0, this thread stop the trace.

So – If you want to start trace on live system, You need not restart AOS Service. You can just set necessary tracing parameters, then press Apply button to store them into registry (do not agree to restart service if Configuration Utility will try to propose it) and then just press ‘Start Trace” to set TraceStart Parameter value to 1. AOS will check the registry for changes and start tracing just in  few seconds.

The very first time, when I tried to run server trace I’ve faced the situation when my log directory, where I had expected to see trace file, was empty and my event log had rather cryptic error message from AOS Server:”Fail to start ETW tracing session: return code is 5”. After spending several hours trying to reinstall and reconfigure everything in my system, I figured out, that probably tracing is started with standard StartTrace WinAPI function, so 5 is an error code described in detail in the MSDN’s description of this function. Actually, this description is quite useful when You have some strange error message from Axapta’s ETW tool and You are trying to figure out what is happening. Moreover, this article has detailed description of all these in/max buffer and buffer size registry parameters, which, according to Trace Parser’s manual You should set before actual attempt to collect the trace.

From reading of the mentioned article, I comprehend two important rule related to security and ETW in Dynamics AX:

  1. The AOS’ user must belong to the ‘Performance Log Users’ user group if it is running on Windows Server 2003 or 2008. WMI event tracing subsystem requires that tracing can be started ONLY from user belonging to the user group or to Administrators user group. Since the idea of including AOS’ user into Admin group sounds completely wrong, the only solution is to include AOS’ user into ‘Performance Log Users’ group.
  2. The AOS’ user must have full access rights to server’s log directory.

Once, I also saw the ”Fail to start ETW tracing session: return code is 112” error message. It was caused by incorrect setting of tracefilesize in Dynamics AX configuration in registry. By default, this setting is set to 10000. So, on attempt to start tracing, WMI Event tracing subsystem checks that target drive for log file has 10000 MBytes available (10Gbytes actually). If available space is less than 10Gbytes, attempt to start tracing fails with the error message mentioned. I recommend to set this setting to something like 1000, because actually Trace Parser tools tends to choke on attempt to import event file larger than several hundreds megabytes.

A Trace log file, with the results of tracing session is created in server’s log directory and has binary format. Result of the tracing can be analyzed via usage of Trace Parser tool, available somewhere at Microsoft’s PartnerSource site.

Another small hint: First time then You try to work with server trace in trace parser tool, do not forget to select correct user in a small window in the upper left corner of the trace form. It seems obvious, but on my first try I spend couple of minutes trying to comprehend relation between what I had been doing in Axapta and what I saw on a screen. Somehow, It did not occur to me immediately that server tracing actually collect trace for ALL current user on a server.

Also – Keep in mind that if server has around 150-200 users, then server trace will produce around 150-300 MBytes of trace data every minute, so do not keep tracing your server longer than necessary.

Now, I want to discuss, why I consider Server Tracing extremely important for every DAX implementation technician. Most of developers heard that this tracing can be used for performance profiling of application. It is true. But actually, there is much more interesting application of Sever Tracing. You can treat it as a magical almost non-intrusive flight recorder which can be used to pip into already running application which suddenly went crazy and you do not have an idea what is happening. Say – You have a call from end-user, who complained that scheduling for one production order already took 20 minutes and still is running, while normally it take no more than a minute. You must try to start server tracing ASAP! Even if this scheduling finish in a minute or so, You hopefully would have enough trace data to analyze the situation later on.

As a conclusion: I consider Server Trace as a primary tool for nailing down floating errors, which does not have clear reproduction scenario by now.

Client tracing

First of all: You will be surprised, but Client Tracing just does not work in default installation. At least, all dynamics ax installations, I’ve checked does not have appropriate trace provider installed. I do not know, whatever it is an error or feature of Dynamics Installer. Before attempt to start client tracing, please run ‘logman query providers’ command and check whatever the command’s output has line starting with “AX Trace” in it. If it is not the case, first thing You should install AX Trace provider by running mofcomp ax_trace.mof from DAX Client’s bin directory. (In case of Vista/W7/WS2008, mofcomp must be run with elevated privileges).

The approach used for starting client tracing is similar to the approach used to start server tracing. Client periodically (I think – in the main message loop) checks the value of TraceStart parameter in the corresponding section of registry. If the value of the parameter has changed from 0 to 1, client starts tracing and so on.

So – The basic approach to starting/stopping tracing on a client side is the same as in case of server side tracing.

Here are several hints on Client Tracing.

  1. Client tracing only works if Axapta’s client is started with Administrative privileges. This requirement arises from usage of StartTrace function. I believe, If you include yourself into “Performance Log Users”, it would achieve the same result, but I never have tried this. Actually, when I need to run client trace I simply start DAX Client in “Run as Administrator” mode.
  2. FIles with client trace are located in Client’s log directory. In default configuration it is set to \Users\Public\Microsoft\Dynamics Ax\Log\ or \Document and Settings\All Users\Microsoft\Dynamics Ax\Log\.
  3. As I said – You can start tracing at any time, by pressing “Start Trace” button in client configuration utility. Since the client checks registry settings less regularly than server, expect some delay between pressing the button and actual start of client tracing.

In a conclusion, I want to mention that client tracing is the most useless kind of a tracing in Axapta. First of all, normally most of business logic is performed on a server side (so no much to see on a client side trace). Second, client trace does not contain information about performed SQL statements at all. (Since statements are always executed on server side).

8 Comments

Cost Explorer in DAX2009

Posted Nov.23, 2009 by admin, under Logistics

 Costing

It seems that not every Axapta user/consultant is aware about the powerful debugging tool for cost related issues, introduced in DAX2009. This tool is a Cost Explorer form, which is available from Inventory->Cost Explorer button in any of different Inventory Transactions forms. Basically, this form allows you to trace down to the deepest cost sources of any inventory transaction. The most important difference between this form and the Trace form, available starting from ancient versions of Axapta, is that the Cost Explorer form fetches tracing information from the Inventory Settlement data, while the Trace form just settle issue to receipts on the fly. This makes the Cost Explorer form to be a powerful cost debugging tool, which allows you to deconstruct a cost of any inventory transaction. Say – You open the form for closed inventory transaction for sales order. After expanding the cost sources for this transaction, you see that it cost originated from two receipt transactions – one is a production order and another is a transfer order from a different warehouse. By expanding receipt node for the transfer, you find that this transfer cost is originating in a production order, then that production order’s cost is originating from three purchases and one transfer order, which,in turn, originates from fourth purchase order. So, after considerable number of mouse clicks, You would find all the originating inventory transactions and understand the source for every cent of the cost price of the issue in question. Here is little more details about the form:
1. The form can be called for receipt and issue inventory transactions.
2. For an inventory receipt, the form can show not only transaction data itself, but also all the adjustments made for the receipt. The cost structure engine can even differentiate between kinds of adjustment (On Hand Adjustment/Transaction Adjustment, Rounding during inventory closing, Standard Cost adjustment, Other types of adjustment).
3. For a receipt from production order, even data from route transactions (essentially – non-material part of cost) are displayed.
4. For an inventory transaction, the cost tree node text shows Item No, financial inventory dimension, Item Name, Transaction Type, Order number (like PO’s number, Inventory journal’s Number), Settled Qty from this transaction, cost contribution per settled piece (basically – settledValue/settledQty), Settled Value.
5. For a receipt adjustment, the cost tree node text shows type of adjustment, settled Qty from transaction adjusted,cost contribution per settled piece (basically – adjustmentValue/settledQty), Adjustment Value.
6. The form displays the adjustments made ONLY before the date of the last inventory closing. If you want to analyze ALL the adjustments for given receipt transaction, You should not rely on this info. To analyze all the adjustments – just open the Settlements form for given inventory transaction.

When I am talking about Settled Qty, I mean “Quantity contribution” of this transaction into the top-level transaction, for which the Cost Explorer form has been started. Say – We have a purchase order for 30 pieces of raw material, production order, which consumes all 30 pieces of raw material and produces 6 pieces of finish good and we have sales order, which sells 2 pieces of finish goods. So – If we try to trace the cost structure down from sales order to purchase order, then for settlements from purchase order receipts to production order issues, we would see Settlement Quantity of 10 (despite the fact, that all 30 pieces from receipt has been settled to issues during the last inventory closing). This is because we are analyzing the cost structure from Sales Order, which has sold only 1/3 of finish goods. So – Only 10 pieces of original purchase have contributed into COGS.
It should be mentioned also, that contributed cost value from some receipts can be LESSER then 1 cent (eurocent, kopek, etc). This is perfectly normal. If the cost value of the issue in question is not equal to the total of originating receipts contributions into cost, just try to increase number of decimals on the Setup tab of the form.

Finally, I want to mention, that although this form works perfectly well, it leaves overall impression of a kludge made to quickly fix the problem. The cost data is being gathered through too many sources; Approach for cost structure display is not aligned with cost group and cost breakdown mechanisms introduced in DAX2009.
I hope that in some of the future versions, Axapta would introduce some high level engine for cost structure analysis and manipulation, which would disconnect cost data from all this inventory transactions, inventory settlements, indirect cost transactions, route transactions, markup transactions and so on. It would be a great advantage for the costing process in Axapta.

Leave a Comment

Update to the IC article

Posted Nov.02, 2009 by admin, under Inventory costing, Logistics

Section on Inventory closing for Physical inventory transactions is just added.

Leave a Comment

History of inventory locking in DAX

Posted Sep.28, 2009 by admin, under Logistics

 Inventory, Locking, Performance

I’ve just translated and published my old article on the history of the inventory data locking in Dynamics AX. Well – It was written as a kind of sales driver for Dynamics AX 4.0, because improvement in inventory locking was the only actually useful improvement of DAX 4.0 against DAX 3.0. :) Maybe this article is not that actual by now, because most of customers has moved to DAX 4.0 already, but it can be useful for developers and architects to understand the history and reasoning for the current approach to inventory on-hand data locking and update…

2 Comments

Costing and inventory closing

Posted Sep.22, 2009 by admin, under Inventory costing, Logistics

 Costing, FIFO, Inventory closing, LIFO, Standard Cost

Article on Inventory closing and costing has been published. It is a translation from Russian original, first published 2+ years ago in my old blog (http://blogs.technet.com/denisfed/). The translation generally retained info from the old article, but has been cleaned from some peculiarities of Russian accounting and Russian localization of Dynamics AX.

Leave a Comment

Search

Categories

  • BOM (2)
    • BOM Level (1)
  • General Ledger (1)
  • Infrastructure (3)
  • Logistics (4)
    • Inventory costing (2)
  • MRP (1)
  • Non Technical (1)
  • Performance (1)

Archives

  • January 2012 (1)
  • November 2011 (1)
  • April 2011 (1)
  • September 2010 (1)
  • March 2010 (1)
  • November 2009 (2)
  • September 2009 (2)

BlogRoll

  • AX Coder
  • Dynamics AX Tools and Tutorials
  • Dynamics AX Training
 
Powered by WordPress. Sky theme ported by mywpthemesite.com