We’ve recently become involved in a few projects addressing performance and scaling issues for products and systems built on top of Dynamics CRM, so we thought we’d share a few major areas of concern when designing and building highly scalable / high performance Dynamics CRM systems.
The key to any successful highly scalable, high performance solution lies mostly in the design. A poor design on the fastest hardware in the world is still going to fail. Each performance critical component of the system must be designed to scale up (perform better with additional hardware capability) and scale out (perform well when distributed across multiple machines). This starts with simple principles, such as: reducing chattiness and dependencies on external systems; proper use of caching throughout the application and its integrations; coding best practices; and efficient use of CPU cycles, disk and memory. But it also involves efficient implementation of business requirements and a conversation with stakeholders about requirements that will and won’t scale. Without a constant focus on extreme scale, it is all too easy for developers to design inefficient systems. In terms of scaling out, the easiest rule of thumb is to insist on a stateless design at all layers of the application. Because no server holds state that another request depends on, application servers can be added behind a simple load balancer.
[box] THE IMPORTANCE OF DESIGN IN SCALABILITY
On an extreme scale project, a series of poorly designed CRM plug-ins resulted in a single business operation (creating a customer ticket) performing 20 to 30 additional database operations. In production, the system was handling 12,000 to 15,000 tickets per day, which resulted in as many as 450K database operations a day. After code assessment and re-design, the exact equivalent business logic was implemented with only 7 database operations, cutting the database load by a factor of three to four (a reduction of roughly 65% to 77%). [/box]
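To make the arithmetic in the example above concrete, here is a small Python sketch that uses the upper end of the quoted ranges. The variable names are ours and purely illustrative.

```python
# Worked arithmetic for the ticket example, using the upper end of each range.
tickets_per_day = 15_000   # upper end of 12,000 to 15,000 tickets per day
ops_before = 30            # upper end of 20 to 30 extra operations per ticket
ops_after = 7              # operations per ticket after the re-design

daily_before = tickets_per_day * ops_before
daily_after = tickets_per_day * ops_after

# Percentage of the original load removed, and the equivalent ratio.
reduction_pct = 100 * (daily_before - daily_after) / daily_before
factor = daily_before / daily_after

print(f"before: {daily_before:,} ops/day")
print(f"after:  {daily_after:,} ops/day")
print(f"reduction: {reduction_pct:.0f}% ({factor:.1f}x fewer operations)")
```

At the lower end of the range (20 operations per ticket) the same calculation gives a reduction of about 65%.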
Microsoft SQL Server Hardware
In our experience, the SQL database is the bottleneck in most projects. Unfortunately, the SQL server is also the one component of the system that can only be scaled up, and not out. Microsoft CRM is not “application aware” of advanced Microsoft SQL Server scaling techniques (such as secondary read replicas), so it cannot take advantage of them. This means that for SQL server scaling, the first line of defense is efficient design and code, followed by as much hardware as you can afford. Microsoft CRM will also benefit greatly from proper SQL server configuration, proper indexing, page and row compression and a handful of additional optimizations.
Our experience has taught us that integration is often a source of performance bottlenecks, especially when integration transactions are time sensitive. It can be difficult to develop multi-threaded ETL solutions that parallelize tasks efficiently. In addition, ETL developers commonly fail to make efficient use of caching during the transformation, and end up repeating the same lookups for every record. Finally, the .NET HTTP connection limit must be tuned (for example through ServicePointManager.DefaultConnectionLimit), because by default a client application is limited to 2 concurrent HTTP connections per host, no matter how many threads a properly built multi-threaded ETL solution runs.
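As a rough illustration of the two ideas above (parallel transformation and caching of repeated lookups), here is a minimal Python sketch. It is not CRM SDK code: fetch_account_id is a hypothetical stand-in for a remote lookup, and LookupCache, transform and run_etl are names we made up for this example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class LookupCache:
    """Thread-safe cache so each distinct key is looked up only once."""

    def __init__(self, loader):
        self._loader = loader
        self._values = {}
        self._lock = threading.Lock()
        self.misses = 0  # number of times the loader was actually called

    def get(self, key):
        # Holding the lock during the load keeps the sketch simple; it
        # serializes cache misses but guarantees one load per key.
        with self._lock:
            if key not in self._values:
                self.misses += 1
                self._values[key] = self._loader(key)
            return self._values[key]


def fetch_account_id(account_number):
    # Hypothetical stand-in for an expensive remote lookup.
    return f"ACC-{account_number}"


def transform(row, cache):
    # Resolve the reference through the cache and clean up the subject.
    return {
        "account_id": cache.get(row["account_number"]),
        "subject": row["subject"].strip(),
    }


def run_etl(rows, workers=8):
    cache = LookupCache(fetch_account_id)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the input order of the rows.
        transformed = list(pool.map(lambda r: transform(r, cache), rows))
    return transformed, cache.misses


rows = [{"account_number": str(i % 5), "subject": f" ticket {i} "} for i in range(100)]
records, misses = run_etl(rows)
print(len(records), "records transformed with", misses, "lookups")
```

In a real integration the worker count would need to be matched to the HTTP connection limit, since extra threads gain nothing if they queue behind two connections.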
Because ETL solutions can load a massive number of records into CRM in a compressed time period, designers should consider which business logic that normally resides in CRM should instead take place in the transform process or the staging process, whichever is more efficient. Suppose that when an order is received by the CRM system, it automatically generates a service code and a delivery date, and performs some data validation to protect against bad user input. These processes can be slow inside the system when compared to performing them as part of the ETL or in a pre-import layer. Replicating the processes before importing into CRM will result in faster processing times and eliminate unnecessary computing time and database calls. In the event that these functions also need to fire from within CRM, the processing code can either be called from CRM, or (less optimally) the business logic can be replicated both inside and outside CRM.
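A pre-import layer for the order example might look something like the Python sketch below. The field names, the service code format and the five-day delivery rule are all hypothetical, chosen only to show validation and enrichment happening before anything reaches CRM.

```python
from datetime import date, timedelta


def prepare_order(raw, received):
    """Validate and enrich one order before it is loaded into CRM."""
    quantity = int(raw["quantity"])  # raises ValueError on non-numeric input
    if quantity <= 0:
        raise ValueError(f"invalid quantity: {quantity}")
    region = raw["region"].strip().upper()
    if not region:
        raise ValueError("region is required")
    return {
        "order_id": raw["order_id"],
        "region": region,
        "quantity": quantity,
        # Hypothetical rule: region prefix plus zero-padded order number.
        "service_code": f"{region}-{int(raw['order_id']):06d}",
        # Hypothetical rule: delivery five calendar days after receipt.
        "delivery_date": received + timedelta(days=5),
    }


def prepare_batch(raw_orders, received):
    """Split a batch into records ready to load and rejected records."""
    ready, rejected = [], []
    for raw in raw_orders:
        try:
            ready.append(prepare_order(raw, received))
        except (KeyError, ValueError) as exc:
            rejected.append((raw, str(exc)))
    return ready, rejected


ready, rejected = prepare_batch(
    [
        {"order_id": "42", "region": " nw ", "quantity": "3"},
        {"order_id": "43", "region": "se", "quantity": "0"},
    ],
    received=date(2024, 1, 10),
)
print(len(ready), "ready,", len(rejected), "rejected")
```

Keeping the rules in plain functions like these also makes it easier to call the same code from CRM later, rather than maintaining two copies of the logic.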
Finally, in extreme cases, it may be necessary to bypass CRM’s SDK and do direct database inserts. While this is considered “unsupported”, with proper guidance and care, it is possible and can be done for narrowly scoped situations. This can (as a last resort) enable extremely fast direct SQL bulk operations.
Infrastructure considerations go beyond the hardware allocated to any one server. One must consider the network between machines and the overall hardware strategy of the IT team. First, an application that supports millions of transactions per day places a very heavy load on the network infrastructure. Fast machines that cannot talk to each other quickly will only be as fast as the weakest link between them. While chattiness reduction, TCP/IP tweaks, HTTP compression and other techniques can improve performance on the wire, a properly configured and isolated network can be an essential ingredient in a high performance system.
Another area to consider is virtualization. Virtualization is an excellent technology and we fully embrace and encourage it. However, virtualization introduces its own issues when it comes to extreme scale. We strongly recommend against virtualizing the SQL backend or any extreme scale-up components of the solution. The web front-ends, however, can be virtualized, although special care will need to be taken to deal with specific vendor limitations. As an example, VMware has very specific considerations that need to be taken into account when virtualizing multi-core machines. In addition, even simple things such as CPU power management may have dramatic effects on the performance of virtualized machines.