Simple log file processing in Python
The other day I found myself in the unfortunate position of needing to scan through raw server logs to try to gather some information about a rare issue. Opening these log files in a text editor and doing a quick text search wasn't a great option: the log files had millions of lines, were 500MB+ in size, and the text editors just gave up trying to search, multi-select, and extract the lines I needed.
I've recently gotten into Python (initially as a requirement for a project at work), and while I still have a lot to learn, I have found it to be an amazing tool for scripting out quick little solutions to annoying problems. Like debugging server logs.
A couple of minutes and 22 lines of Python later, I had taken a few million lines of server logs and extracted the 50 or so messages that were relevant. So I decided to take a few extra minutes and publish this post to encourage others to give Python a shot, with an example of a pretty common use case.
parse_logs.py:
import os
import re

# Regex used to match relevant loglines (in this case, a specific IP address)
line_regex = re.compile(r".*fwd=\"12.34.56.78\".*$")

# Output file, where the matched loglines will be copied to
output_filename = os.path.normpath("output/parsed_lines.log")
# Overwrites the file, ensure we're starting out with a blank file
with open(output_filename, "w") as out_file:
    out_file.write("")

# Open output file in 'append' mode
with open(output_filename, "a") as out_file:
    # Open input file in 'read' mode
    with open("test_log.log", "r") as in_file:
        # Loop over each log line
        for line in in_file:
            # If log line matches our regex, print to console, and output file
            if (line_regex.search(line)):
                print(line)
                out_file.write(line)
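If you find yourself searching for more than one address, the same idea can be wrapped in a small function. This is only a rough sketch, not what's in the repo: the function name `extract_matching_lines` and its parameters are hypothetical. It also makes two small changes. It runs the address through `re.escape`, so the dots match literal dots rather than any character, and it opens the output file once in "w" mode, which already starts from a blank file.

```python
import re


def extract_matching_lines(in_path, out_path, ip_address):
    """Copy every line containing fwd="<ip_address>" from in_path to out_path.

    Hypothetical helper for illustration; returns the number of lines copied.
    """
    # re.escape makes the dots in the address literal instead of "any character"
    pattern = re.compile(r'fwd="' + re.escape(ip_address) + r'"')
    matched = 0
    # "w" mode truncates an existing file, so no separate blanking step is needed
    with open(in_path, "r") as in_file, open(out_path, "w") as out_file:
        # Iterating over the file object reads one line at a time,
        # so a large log is never loaded into memory all at once
        for line in in_file:
            if pattern.search(line):
                out_file.write(line)
                matched += 1
    return matched
```

Calling `extract_matching_lines("test_log.log", "output/parsed_lines.log", "12.34.56.78")` would then do roughly what the script above does, minus the console printing. Note that the `output` folder has to exist already, since `open` won't create it.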
If you'd like to see this example running on your own machine:
- Make sure you have Python installed
- Clone this example from our GitHub here: https://github.com/codehangar/python-log-parse-example
- Run
python parse_logs.py
Example Run: