Test Data Based on Real Data in PySpark
Quickly build test datasets that follow the same schemas as your real data but contain only fake values.
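As a rough illustration of the idea, here is a minimal sketch in plain Python that generates fake rows from a schema description. It is not the article's own method: the column names, the `SCHEMA` list, and the helpers `fake_value` and `fake_rows` are all hypothetical. The schema is assumed to be a list of `(name, type)` pairs such as those returned by `df.dtypes` on a real PySpark DataFrame.

```python
import random
import string
from datetime import date, timedelta

# Hypothetical schema: (column name, type name) pairs, in the shape that
# df.dtypes returns for a PySpark DataFrame. Replace with your own.
SCHEMA = [
    ("customer_id", "int"),
    ("name", "string"),
    ("balance", "double"),
    ("signup_date", "date"),
]


def fake_value(type_name, rng):
    """Return one random value for a small set of simple type names."""
    if type_name in ("int", "bigint"):
        return rng.randint(1, 1_000_000)
    if type_name == "double":
        return round(rng.uniform(0, 10_000), 2)
    if type_name == "string":
        return "".join(rng.choices(string.ascii_lowercase, k=8))
    if type_name == "date":
        return date(2020, 1, 1) + timedelta(days=rng.randint(0, 365))
    raise ValueError(f"Unsupported type in this sketch: {type_name}")


def fake_rows(schema, n, seed=0):
    """Build n tuples of fake values, one value per schema column."""
    rng = random.Random(seed)  # seeded so test data is reproducible
    return [tuple(fake_value(t, rng) for _, t in schema) for _ in range(n)]


columns = [name for name, _ in SCHEMA]
rows = fake_rows(SCHEMA, 5)

# Assuming a SparkSession named `spark` is available, the rows could then
# be loaded into a DataFrame with:
# df = spark.createDataFrame(rows, columns)
```

Seeding the generator keeps the fake dataset stable between test runs, which makes failures easier to reproduce.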
Running PySpark from Docker
A very basic Docker setup for running a Jupyter Notebook and a Spark server with the Spark UI, so you can try out new ideas and test PySpark locally without expensive infrastructure.
Emil Moe

Software and Data Engineer

I created this website to help you empower your infrastructure, so that you don't have to spend as many hours on research as I did. I chose to keep the site ad-free, so if you like what I do, please consider supporting me on Patreon.