Following is some analysis conducted by Vamsi Kundenti. Find this info here.

I reproduce this here just in case the original website goes down.

<–Start –>

“I happened to look at the algorithmic details of UNIX Sort, a LINUX version of the classic UNIX sort is a part of GNU coreutils-6.9.90. This is classic example of the standard External R-Way merge , to sort a data of size N bytes with a main memory size of M so it creates N/M runs and merges R at a time, the number of passes through the data is log(N/M)/log(R) passes.In fact the lower bound(runtime) for external sorting is Ω((N/M)log(N/M)/log(R)). All the external memory sorting algorithms provided in the literature are optimal so the fight here is minimizing the constant before the number of passes.

UNIX sort treats keys are lines (strings), the algorithm followed by unix sort is in fact the R-Way merge. Let the input file size be IN_SIZE.
1. Choosing Run Size:
——————————–
The sizes of the initial runs are chosen from the total physical memory (TOTAL_PHY) and available memory (AVAIL_PHY). RUN_SIZE = (MAX(TOTAL_PHY/8,AVAIL_PHY))/2
maximum of 1/8th of TOTAL_PHY and AVAIL_PHY and divided by 2. See function “default_sort_size (void)” in the code.
2. Creating Runs:
————————-
Unix sort creates a temporary file for every run. So it creates IN_SIZE/RUN_SIZE (celing) temporary files. Internally it uses merge sort to sort internally it uses an optimization mentioned in Knuth volume 3 (2nd edition), problem 5.2.4-23.
3. Merging:
—————-
The number of runs merged at any time is hard coded in the program see macro NMERGE , NMERGE is defined to be 16 so it merges exactly 16 runs at any time.”

<–End–>