Regular Expression Cheat Sheet

Regular expressions are used very often in NLP. I usually use them to find well-defined patterns in text. Regex has a lot of symbols and rules, so here I tabulate some frequently used symbols that I always need to look up online. Selected from this webpage. Character / Legend: \d, one digit from 0 to 9 […]
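As a quick illustration of the \d entry above, here is a minimal Python sketch. The sample sentence is made up for the example, and the re.ASCII flag is my own addition: in Python 3, \d on a str pattern also matches digits from other scripts unless that flag is set, so the flag keeps the behavior in line with the "0 to 9" description.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "Order 66 shipped on 2021-03-04 to suite 9."

# \d matches exactly one digit; re.ASCII restricts it to 0-9.
single_digits = re.findall(r"\d", text, flags=re.ASCII)

# \d+ matches a run of one or more consecutive digits.
digit_runs = re.findall(r"\d+", text, flags=re.ASCII)

print(single_digits)
print(digit_runs)
```

The first call returns every digit as its own match, while the second groups neighboring digits together, which is usually what I want when pulling numbers out of text.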

PDF to TXT

I’ve procrastinated for so long, so let me write about what I’ve done so far. The first step was to convert the SANDAG public comments from PDF to CSV. Although there are some open-source tools for PDF-to-CSV conversion, the file I’m converting exceeds their file size limits. Therefore, I chose […]