Somebody pointed out Grant Holliday’s blog to me a few weeks ago and mentioned that it had not been updated in a while.
I thought it might be worth having a go at providing an update to it.

A bit of MS history

Grant wrote his article “What does a well maintained TFS server look like” back in 2013 and it is possibly one of the most referenced TFS articles out there.
It contains information on all areas that matter to a TFS environment, including services that usually integrate with TFS.

Maintaining a TFS environment is not always an easy job, but some things have become easier since Grant’s article was written:

  • Retention Policies have been introduced consistently, doing away with the need for the Test Attachment Cleaner and some other Power Tools
  • Configuring a new TFS environment has become a lot easier, and the configuration assistant will now optimise your data tier and other services, so that your TFS environment runs better from the start
  • Many features that were previously deployed as “additional services” (SharePoint, Reporting, Project Server) have now been integrated with TFS directly, reducing the complexity of deployments

Run in several environments

  • You should have a test environment where you can test new security updates and feature updates before they reach production
    • This environment should mirror your prod environment as closely as possible
  • The application of updates should be as automated as possible
  • You should think about creating automated tests for all major TFS workflows, to check if an update breaks any important business processes without having to conduct a manual test cycle

Be Current

  • Always stay close to the latest TFS version
    • From TFS 2018 onwards, TFS ships one major release a year, plus 2 stability updates and 1 feature update (a total of 4 releases per year)
  • Be on the latest SQL Server Service Pack and stay on top of major SQL Server updates
    • TFS will usually support the latest two versions of SQL
    • TFS 2018 supports SQL 2016 and 2017
  • Be on the latest version of Windows Server
  • Apply security fixes as quickly as you can
  • Apply Windows Updates regularly

Configuring TFS

  • Check the installation requirements
  • Configure antivirus exclusions
    • This has not changed in recent versions of TFS
  • Ensure firewall rules are correct
  • Unless you want to run in WORKGROUP mode, run the TFS service under a domain account rather than a local account
  • Plan for the size of your deployment
    • Do you need a highly available application tier?
    • Do you need a clustered SQL tier?
    • How easy is it to scale out with your current architecture design?

Configuring SQL

  • Know the capabilities of your hardware
    • Take a benchmark and check if your hardware can support the size of your deployment
    • Work with 64k disk allocation for the best performance
  • Use different volumes and disks for
    • SQL data
    • SQL logs
    • TempDBs
    • System DB
  • Use more than one TempDB data file
    • Preferably these should be on separate disks
    • If you have 8 cores or fewer, you should have the same number of TempDB data files as cores.
    • If you have more, start with 8 and add sets of 4 as needed.
  • Consider using the following SQL trace flags. (These should always be tested extensively first and need to make sense for your scenario)
    • T4199 Query Processing Optimisation
      • improve performance, increase CPU/memory usage
    • T1211 Prevent Table Lock Escalation
      • increase CPU/memory usage
    • T1118 Reduce TempDB contention
      • improve processing speed, increase memory usage
    • T1222 Generate XML deadlock graphs
      • improve reporting & troubleshooting experience, increase CPU/memory usage
    • T1117 Assure equal auto growth for multiple Temp DBs
      • you can achieve this manually by pregrowing TempDB
      • increased disk/CPU/memory usage
    • T3226 Suppress Backup Information events
      • reduce event log size
      • reduce CPU usage
      • less information in event log
  • Configure SQL server error log rollover
  • Set an appropriate “Max Server Memory” value to make sure your SQL server does not fall over when it is most needed.
    • Reserve at least 1 GB of RAM for the OS per 4 GB of RAM installed
    • If you have more than 16GB of RAM installed, leave an extra 1GB for every 8GB
    • On a one-box-setup, where all services are on the same machine, you will need to leave more for IIS, TFS, and other services
  • If your version of SQL Server supports it, you can switch on Transparent Data Encryption to have your TFS data encrypted at rest
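
The TempDB and “Max Server Memory” rules of thumb above can be turned into a small calculator. This is only a sketch of one reading of those rules: the function names are made up for illustration, the memory rule is read as “1 GB per 4 GB on the first 16 GB, then 1 GB per further 8 GB”, and it assumes a dedicated SQL box (a one-box setup needs a larger reserve).

```python
def tempdb_data_file_count(cores: int) -> int:
    """Starting number of TempDB data files for a given core count."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    # Up to 8 cores: one file per core. More than 8: start with 8
    # and add sets of 4 later only if contention is still observed.
    return min(cores, 8)


def max_server_memory_mb(installed_gb: int) -> int:
    """Suggested 'Max Server Memory' in MB (assumed reading of the rule)."""
    if installed_gb < 4:
        raise ValueError("this sketch assumes at least 4 GB installed")
    # Reserve 1 GB per 4 GB installed, counted on the first 16 GB only.
    reserved_gb = min(installed_gb, 16) / 4
    # Above 16 GB, reserve 1 GB for every additional 8 GB.
    reserved_gb += max(installed_gb - 16, 0) / 8
    return int((installed_gb - reserved_gb) * 1024)


if __name__ == "__main__":
    for cores, ram_gb in [(4, 16), (16, 64)]:
        print(cores, "cores:", tempdb_data_file_count(cores), "TempDB data files")
        print(ram_gb, "GB RAM:", max_server_memory_mb(ram_gb), "MB max server memory")
```

Treat the result as a starting value and adjust it after watching the server under real load.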

SQL Maintenance

  • Grow your databases in 1GB increments rather than at a percentage
    • If autogrowth is set to a percentage, a single large check-in can trigger a very large growth operation on a big database, which can hurt TFS performance while it runs
  • Rightsize your databases and stay on top of their growth
    • Do this with your TempDBs too
  • Monitor long running transactions
  • Monitor table size and row counts
  • Monitor the error log
  • Take regular backups and use marked transactions
  • Check the database for corruption regularly
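
To see why fixed increments are preferred, it helps to work out how large a percentage-based growth step becomes on a big database. The sketch below does that arithmetic and builds the T-SQL to switch a file to a fixed 1 GB increment. The database and logical file names in the usage lines are placeholders; look up the real logical names (for example in sys.database_files) before running anything.

```python
def percent_growth_mb(current_size_mb: int, percent: int) -> int:
    """Size of the next autogrowth step when growth is set as a percentage."""
    return current_size_mb * percent // 100


def fixed_growth_statement(database: str, logical_file: str,
                           growth_mb: int = 1024) -> str:
    """Build the T-SQL that switches one file to a fixed growth increment."""
    return (
        f"ALTER DATABASE [{database}] "
        f"MODIFY FILE (NAME = N'{logical_file}', FILEGROWTH = {growth_mb}MB);"
    )


if __name__ == "__main__":
    # A 500 GB database growing by 10% allocates 50 GB in one operation.
    print(percent_growth_mb(500 * 1024, 10), "MB per growth step at 10%")
    # Placeholder names, not taken from a real server.
    print(fixed_growth_statement("Tfs_DefaultCollection", "Tfs_DefaultCollection"))
```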

TFS Optimisation

  • Consider running in a highly available configuration
  • Consider using SQL Server Page Compression and ensure that you have plenty of free space on your data tier
  • Monitor the IIS log, but make sure you configure a rollover for it
    • Otherwise it will grow indefinitely
  • Move the Application Tier cache to a logical drive and dedicate it to the cache
  • Enable SMTP settings
  • Disable the IIS App Pool Idle Timeout

TFS Administration

  • Configure a retention policy for all your assets
    • This replaces the Test Attachment Cleaner
  • Monitor Execution time of calls to the database
  • Review the activity log and job monitoring console at http://myserver:8080/tfs/_oi
  • Clean up unused work item tracking fields (witadmin listfields /unused)
  • Reduce usage of and manage the size of
    • global lists
    • team fields
    • constants
  • Evaluate work item tracking fields that are set to reportingtype=’dimension’. Do they really need to be in the cube?
    • If not, the reporting type can be changed to “detail”
  • Evaluate if you have custom work item tracking fields that are used in many work item queries and would benefit from being indexed. (witadmin indexfield /index:on)
  • Check tbl_EventSubscriptions for invalid email and SOAP subscriptions. Review the “Notification” settings for yourself, each team, and other users. You can find the view by hovering over your portrait in TFS.
  • When adding custom fields, make sure they are necessary and that the changes support the teams’ process rather than impede it or make it more difficult
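
The witadmin commands mentioned above are easier to repeat across collections if you script them. The following is a minimal sketch that only assembles the command lines. It assumes witadmin is on the PATH of a machine with Visual Studio or Team Explorer installed, and the collection URL and field reference name in the usage lines are hypothetical. The changefield call with /reportingtype:detail is my reading of the witadmin documentation, so check it against your version first.

```python
import subprocess


def witadmin_args(command: str, collection_url: str, *options: str) -> list:
    """Assemble a witadmin command line as an argument list."""
    return ["witadmin", command, f"/collection:{collection_url}", *options]


def list_unused_fields(collection_url: str) -> list:
    return witadmin_args("listfields", collection_url, "/unused")


def index_field(collection_url: str, ref_name: str) -> list:
    return witadmin_args("indexfield", collection_url, f"/n:{ref_name}", "/index:on")


def set_reporting_type_detail(collection_url: str, ref_name: str) -> list:
    return witadmin_args("changefield", collection_url, f"/n:{ref_name}",
                         "/reportingtype:detail")


def run(args: list) -> str:
    """Execute the command and return its output (needs witadmin installed)."""
    return subprocess.run(args, capture_output=True, text=True, check=True).stdout


if __name__ == "__main__":
    url = "http://myserver:8080/tfs/DefaultCollection"  # hypothetical collection
    print(" ".join(list_unused_fields(url)))
    print(" ".join(index_field(url, "MyCompany.CustomField")))  # hypothetical field
```

Building the argument list separately from running it lets you review or log the exact command before it touches the server.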

Build Maintenance

  • Monitor available disk space on your build machines
    • In TFS 2017 and newer you can configure build agents to clean up after themselves.
    • This option is available on the Pools/Queues panel.
  • Monitor average queue times and add additional agents to your pools if necessary
  • Be cautious about having more than one agent on a build machine
    • There are good reasons to do so, but note that some build tools can only be used by one process at a time
  • Use a package management solution
    • Either in TFS or a private NuGet repository
  • Stage your build artifacts to TFS to enable a smooth handover to Release Management
  • Configure your CI builds to be as quick as possible
    • for example: do not pull code that is not required for the build
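
Disk space monitoring on build machines can start as a very small script run from a scheduled task. This sketch uses only the Python standard library. The 15% threshold and the function names are arbitrary choices for illustration, and the path should point at the drive that holds your agent working folder.

```python
import shutil


def free_space_percent(path: str) -> float:
    """Percentage of free space on the volume that contains `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total * 100


def is_low_on_space(path: str, minimum_free_percent: float = 15.0) -> bool:
    """True when free space has dropped below the chosen threshold."""
    return free_space_percent(path) < minimum_free_percent


if __name__ == "__main__":
    path = "."  # replace with the agent work folder
    print(f"{free_space_percent(path):.1f}% free, low: {is_low_on_space(path)}")
```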

Release Maintenance

  • Make sure your retention policy is configured, so that builds are maintained for as long as releases are maintained
  • Use different agents for prod and test environments and use queue/pool roles to restrict access to the agent queues, if necessary
  • Take advantage of automated testing
  • Use “PowerShell on Target Machines” if you can for a more resilient WinRM deployment experience
  • Monitor average queue times and add additional agents to your pools if necessary
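
Average queue time is simply the mean of “start time minus queue time” over recent builds or releases. The sketch below assumes you have already collected those two timestamps for each run (for example from the build history); the sample data and the five-minute limit are invented for illustration.

```python
from datetime import datetime, timedelta


def average_queue_time(runs) -> timedelta:
    """Mean wait between queueing and starting, for (queued, started) pairs."""
    waits = [started - queued for queued, started in runs]
    if not waits:
        return timedelta(0)
    return sum(waits, timedelta(0)) / len(waits)


def needs_more_agents(runs, limit: timedelta) -> bool:
    """True when the average wait exceeds the limit you consider acceptable."""
    return average_queue_time(runs) > limit


if __name__ == "__main__":
    # Invented sample: two runs that waited 2 and 4 minutes for an agent.
    sample = [
        (datetime(2018, 1, 10, 9, 0), datetime(2018, 1, 10, 9, 2)),
        (datetime(2018, 1, 10, 9, 5), datetime(2018, 1, 10, 9, 9)),
    ]
    print(average_queue_time(sample))
    print(needs_more_agents(sample, timedelta(minutes=5)))
```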

Additional Services

  • SharePoint
    • this has not been updated significantly since 2013
    • the integration is no longer available in TFS 2018 and newer
    • use Git Large File Storage (LFS), file shares, and package management to replace the storage features
    • use TFS Web Access Reporting to meet the reporting requirement
  • Reporting Services
    • this has not been updated significantly since 2013
    • this could be a blocker if you intend to move to VSTS
    • consider using TFS Web Access for lightweight reporting
    • use the REST API to create custom reports
    • build your own dashboard widgets
  • Project Server
    • this is not supported in TFS 2017 or newer
    • support is available from partners if you truly need Project Server
    • most teams will be better off in the long term by moving to a truly agile process, where Project Server is less important
  • XAML build services
    • XAML build services are no longer supported in TFS 2018 and newer
    • You should consider moving to vNext builds as soon as possible
    • Existing XAML builds should be managed as technical debt
    • There is no automated retention policy for XAML build agents and controllers. You need to create one yourself or clear the build content out on a regular basis.
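
As a starting point for the custom reports mentioned under Reporting Services, the REST API can run a work item query (WIQL) and return the matching work item IDs. The sketch below only builds the HTTP request and does not send it. The collection URL, project name, and api-version value are assumptions for illustration, and authentication (typically Windows authentication on premises, or a personal access token) is deliberately left out.

```python
import json
import urllib.parse
import urllib.request


def wiql_request(collection_url: str, project: str, wiql: str,
                 api_version: str = "1.0") -> urllib.request.Request:
    """Build (but do not send) a POST request that runs a WIQL query."""
    url = (f"{collection_url}/{urllib.parse.quote(project)}"
           f"/_apis/wit/wiql?api-version={api_version}")
    body = json.dumps({"query": wiql}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    request = wiql_request(
        "http://myserver:8080/tfs/DefaultCollection",  # hypothetical collection
        "MyProject",                                    # hypothetical project
        "SELECT [System.Id] FROM WorkItems WHERE [System.State] = 'Active'",
    )
    print(request.get_method(), request.full_url)
```

Once authentication is added, the response can be fed into whatever charting or dashboard tooling the team already uses.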