Ever wonder how interpreted languages like Python work?

I, like many people, found my way into Data Science circuitously via the physical engineering fields (Naval Architecture & Ocean Engineering in particular). As a result, I never took any formal Computer Science courses. As my love for computing and programming has grown over the years, I have always wanted to learn more about how computers work, and specifically about how a dynamic language like Python is actually implemented at its core.

While I know not everyone wants to dive so deeply into language design, I found Crafting Interpreters to be an amazing resource. The author, Robert Nystrom, works on the Dart programming language at Google. The language developed throughout the book (Lox) exists only as a teaching language, but its implementation is quite similar to CPython, at least at the foundational level. If you work through the book, you implement two interpreters: first a tree-walking interpreter in Java, then a bytecode interpreter written in C (just like CPython).
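If you want a quick look at the bytecode side of CPython before committing to the book, the standard library's `dis` module will show you the instructions a function compiles to. This is just a small sketch: the `add` function is a made-up example, and the exact opcode names differ between Python versions, so treat whatever it prints as illustrative.

```python
import dis

def add(a, b):
    # Hypothetical example function; any small function works here.
    return a + b

# dis.get_instructions yields one Instruction per bytecode operation.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)

# dis.dis prints a human-readable listing of the same bytecode.
dis.dis(add)
```

The listing is what the interpreter's evaluation loop actually steps through, which is roughly the kind of loop you end up writing by hand in the second half of the book.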

I cannot recommend this book highly enough for anyone who is interested in diving deeper into the topic of how programming languages are implemented. And if you are curious in particular about CPython itself, Anthony Shaw’s book CPython Internals is also fantastic, albeit more about the practical implementation of CPython than about ground-up implementation.

I like the way this is structured with real code examples. I’d recommend this book to any student currently taking a compilers course, because some books and instructors get so heavy into theory that it’s hard to bridge the gap to practical implementation. The problems you solve along the way to building a fully-fledged compiler are often applicable in many other domains. A lot of devs end up building things like simple AST parsers for higher-level query functionality or some kind of expressive templating, and understanding basics like BNF and context-free grammars makes it much easier to either 1) roll your own parser or 2) make use of existing parser generators (flex/bison anyone?).
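To make the BNF point concrete, here is a minimal sketch of a hand-rolled recursive descent parser for arithmetic. The grammar and all the names (`tokenize`, `Parser`, `parse`, `evaluate`) are made up for illustration and are not taken from the book. Each grammar rule becomes one method, which is the main reason this style is easy to write once you can read the grammar.

```python
import re

# Toy grammar in BNF-like notation (an assumption for this example):
#   expr   ::= term (("+" | "-") term)*
#   term   ::= factor (("*" | "/") factor)*
#   factor ::= NUMBER | "(" expr ")"

TOKEN_RE = re.compile(r"\s*(\d+\.?\d*|[-+*/()])")
OPERATORS = {"+", "-", "*", "/", ")"}

def tokenize(text):
    """Split the input into number and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())  # left-associative AST node
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is None or token in OPERATORS:
            raise SyntaxError(f"expected a number, got {token!r}")
        return float(token)

def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree

def evaluate(node):
    """Walk the AST: leaves are floats, inner nodes are (op, left, right)."""
    if isinstance(node, float):
        return node
    op, left, right = node
    a, b = evaluate(left), evaluate(right)
    return {"+": a + b, "-": a - b, "*": a * b}[op] if op != "/" else a / b

print(parse("1 + 2 * 3"))              # ('+', 1.0, ('*', 2.0, 3.0))
print(evaluate(parse("(1 + 2) * 3")))  # 9.0
```

Operator precedence falls out of the grammar itself: `term` binds tighter than `expr` only because `expr` calls `term`. A generator like bison would build an equivalent parser from the grammar rules instead of hand-written methods.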

This is very true. Even just learning how simple data structures like hash maps (Python dictionaries) and linked lists work has been beneficial for me. And you are spot on about parsing, tokenization, etc., which end up being quite relevant to a lot of data wrangling and cleaning workflows.
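For anyone who has not seen the inside of a hash map, here is a rough sketch of one using separate chaining. It is a simplification under a few assumptions: the class name `ChainedHashMap` and its methods are hypothetical, each bucket is a plain Python list rather than a true linked list, and the resize threshold is arbitrary. CPython's real `dict` uses open addressing rather than chaining, so this shows the general idea and not how `dict` is built.

```python
class ChainedHashMap:
    """Toy hash map: an array of buckets, each holding (key, value) pairs."""

    def __init__(self, capacity=8):
        self._buckets = [[] for _ in range(capacity)]
        self._size = 0

    def _bucket(self, key):
        # The hash picks the bucket; colliding keys share one bucket.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))
        self._size += 1
        # Grow when buckets get crowded so lookups stay fast on average.
        if self._size > 2 * len(self._buckets):
            self._resize()

    def get(self, key, default=None):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default

    def _resize(self):
        pairs = [pair for bucket in self._buckets for pair in bucket]
        self._buckets = [[] for _ in range(2 * len(self._buckets))]
        for key, value in pairs:
            self._bucket(key).append((key, value))

    def __len__(self):
        return self._size


table = ChainedHashMap()
table.put("python", "bytecode VM")
table.put("lox", "teaching language")
print(table.get("python"))   # bytecode VM
print(table.get("missing"))  # None
```

Seeing that a lookup is "hash, jump to a bucket, compare a few keys" makes it much clearer why dictionary keys have to be hashable and why lookups do not slow down much as the table grows.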