For my malware research I need to break the lines of a large number of logs into smaller tokens, so I evaluated the performance of Java’s StringTokenizer, Pattern and Scanner.
Evaluation criterion: the x-axis is the number of tokens a string is broken into. Each string was broken down 1000 times.
Instances of StringTokenizer and Scanner were created dynamically, whereas a single instance of Pattern was created and reused. However, Pattern.split() allocates a new array in memory on every call, and the time taken for this is included in the measured performance.
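The setup described above can be sketched roughly as follows. This is a minimal reconstruction, not the harness actually used: the single-space delimiter, the class and method names, the token counts in `main` and the use of `System.nanoTime()` are all assumptions made for illustration.

```java
import java.util.Scanner;
import java.util.StringTokenizer;
import java.util.function.ToIntFunction;
import java.util.regex.Pattern;

class TokenizerBenchmark {
    // One Pattern instance, compiled once and reused for every split.
    // The single-space delimiter is an assumption.
    private static final Pattern SPACE = Pattern.compile(" ");
    private static final int REPETITIONS = 1000;

    // A new StringTokenizer is created for every call.
    static int countWithTokenizer(String line) {
        StringTokenizer st = new StringTokenizer(line, " ");
        int n = 0;
        while (st.hasMoreTokens()) {
            st.nextToken();
            n++;
        }
        return n;
    }

    // Reuses the shared Pattern; split() allocates a new String[] each call.
    static int countWithPattern(String line) {
        return SPACE.split(line).length;
    }

    // A new Scanner is created for every call.
    static int countWithScanner(String line) {
        int n = 0;
        try (Scanner sc = new Scanner(line)) {
            sc.useDelimiter(SPACE);
            while (sc.hasNext()) {
                sc.next();
                n++;
            }
        }
        return n;
    }

    // Builds a test line such as "t0 t1 t2" containing the given number of tokens.
    static String buildLine(int tokens) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append('t').append(i);
        }
        return sb.toString();
    }

    // Splits the same line REPETITIONS times and returns the elapsed nanoseconds.
    static long time(ToIntFunction<String> splitter, String line) {
        long sink = 0; // accumulate results so the work is not optimised away
        long start = System.nanoTime();
        for (int i = 0; i < REPETITIONS; i++) {
            sink += splitter.applyAsInt(line);
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            throw new IllegalStateException("unreachable");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        for (int tokens : new int[] {10, 100, 1000}) { // assumed x-axis values
            String line = buildLine(tokens);
            System.out.printf("%d tokens: StringTokenizer=%d ns, Pattern=%d ns, Scanner=%d ns%n",
                    tokens,
                    time(TokenizerBenchmark::countWithTokenizer, line),
                    time(TokenizerBenchmark::countWithPattern, line),
                    time(TokenizerBenchmark::countWithScanner, line));
        }
    }
}
```

A loop like this has no JIT warm-up phase, so absolute numbers will vary between runs and machines; a harness such as JMH would give more trustworthy figures.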

Conclusion: Use StringTokenizer whenever possible to improve performance. Where more flexibility is needed, use the other constructs.
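As an illustration of what "more flexibility" can mean in practice, here is a small sketch of two cases where StringTokenizer falls short: it silently skips empty fields, and it only accepts a fixed set of delimiter characters rather than a regular expression. The class name, helper names and sample strings are hypothetical, chosen only for this example.

```java
import java.util.StringTokenizer;
import java.util.regex.Pattern;

class SplitFlexibility {
    // Hypothetical helper: number of tokens StringTokenizer reports for a line.
    static int tokenizerCount(String line, String delims) {
        return new StringTokenizer(line, delims).countTokens();
    }

    // Hypothetical helper: split on a regular expression.
    static String[] regexSplit(String line, String regex) {
        return Pattern.compile(regex).split(line);
    }

    public static void main(String[] args) {
        // Empty field between the two commas.
        System.out.println(tokenizerCount("a,,b", ","));     // prints 2: the empty field is dropped
        System.out.println(regexSplit("a,,b", ",").length);  // prints 3: "a", "", "b"

        // Delimiter described by a regex: comma or semicolon plus optional whitespace.
        System.out.println(String.join("|", regexSplit("a, b;c", "[,;]\\s*"))); // prints a|b|c
    }
}
```

If log lines can contain empty fields whose position matters, this difference is a correctness issue rather than a performance one, and Pattern or Scanner is the safer choice.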