Project: java-podcast-processor

Sample solutions demonstrated by my java-podcast-processor project. You can find the source code here on GitHub. Check out some of my other projects here.

Build a Data-Intensive, Full Stack App from Start to Finish

The strength of a full-stack data engineer is the ability to bring the entire stack together, coordinating microservices and features into a single product. Any developer can slap a new feature onto a project, but features that aren't integrated with the project as a whole run inefficiently and slow down future development. See how everything can work together, from data pipeline to web app to data visualization and user-facing search. View on GitHub

ETL from Cassandra using Spark

Cassandra handles writes quickly but is less suited to read-heavy work such as search and analytics, which is usually left to third-party integrations. For example, Elassandra addresses this with Elasticsearch, and DataStax addresses it with Solr and Spark (or even Graph, depending on the use case). Of course, we could also integrate Cassandra with these same tools using open source connectors and drivers. Check out an example of how to extract your Cassandra data into Spark for an ETL pipeline. View on GitHub
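
As a rough sketch of the extract step, assuming the open source Spark Cassandra Connector is on the classpath: the connector registers itself with Spark SQL as the data source org.apache.spark.sql.cassandra and reads a table identified by the keyspace and table options. The helper class, the keyspace podcast_analysis, the table episodes, and the host address below are all hypothetical, and the Spark calls appear only in comments so that the block itself runs on a plain JDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: collects the read options for one Cassandra table. */
final class CassandraReadOptions {

    /** Data source name registered by the Spark Cassandra Connector. */
    static final String FORMAT = "org.apache.spark.sql.cassandra";

    /** Returns the options the connector uses to locate a table. */
    static Map<String, String> forTable(String keyspace, String table) {
        if (keyspace == null || keyspace.isEmpty() || table == null || table.isEmpty()) {
            throw new IllegalArgumentException("keyspace and table are required");
        }
        Map<String, String> options = new LinkedHashMap<>();
        options.put("keyspace", keyspace);
        options.put("table", table);
        return options;
    }

    public static void main(String[] args) {
        // Assumed names, not taken from the project.
        Map<String, String> options = forTable("podcast_analysis", "episodes");
        System.out.println(FORMAT + " " + options);

        // With Spark and the connector available, the extract step would look like:
        //   SparkSession spark = SparkSession.builder()
        //       .config("spark.cassandra.connection.host", "127.0.0.1") // assumed host
        //       .getOrCreate();
        //   Dataset<Row> episodes = spark.read().format(FORMAT).options(options).load();
        // From there, the transform and load steps are ordinary Dataset operations.
    }
}
```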

Orchestrate Data Pipelines Using Kafka

Kafka coordinates your data pipelines as a message broker that sits in the middle of your distributed infrastructure. Adding Kafka to your project can help everything run smoothly and efficiently, with built-in support for event replay and stream processing, plus exactly-once semantics when producers and consumers are configured for it. View on GitHub
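
Exactly-once delivery and replay both come down to client configuration. Here is a minimal sketch of the settings they typically rely on, built with java.util.Properties so it runs without the Kafka client library. The property keys are standard Kafka client configuration names; the helper class, broker address, transactional id, and group id are assumptions for illustration.

```java
import java.util.Properties;

/** Hypothetical helper: client settings commonly used for exactly-once pipelines. */
final class KafkaExactlyOnceConfig {

    /** Producer settings: idempotent, transactional writes. */
    static Properties producer(String bootstrapServers, String transactionalId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("enable.idempotence", "true");        // broker de-duplicates retried sends
        props.setProperty("acks", "all");                       // required for idempotent producers
        props.setProperty("transactional.id", transactionalId); // enables transactional writes
        return props;
    }

    /** Consumer settings: read committed records only, and replay from the start for a new group. */
    static Properties consumer(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("isolation.level", "read_committed"); // skip records from aborted transactions
        props.setProperty("auto.offset.reset", "earliest");     // a new group starts at the oldest retained record
        props.setProperty("enable.auto.commit", "false");       // offsets are committed explicitly
        return props;
    }

    public static void main(String[] args) {
        // Assumed local broker and ids. In a real pipeline these Properties would be
        // passed to KafkaProducer and KafkaConsumer from the kafka-clients library.
        producer("localhost:9092", "podcast-etl-1").list(System.out);
        consumer("localhost:9092", "podcast-search-indexer").list(System.out);
    }
}
```

A consumer group that has no committed offsets yet and uses auto.offset.reset=earliest reads the topic from its oldest retained record, which is the simplest form of event replay.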

Provide Search for Cassandra DB using Elasticsearch and Flask

Your data won't help you if you don't know how to use it. One way to put it to work is to let end users (or admins) search through it. Elassandra integrates Elasticsearch with your Cassandra DB for near-instant search results and a REST API. One way to access that REST API is through a React app, with a Flask app server in the middle to handle requests. Click here to get an idea of how I can build a similar solution for your app. View on GitHub
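
To make the REST API concrete, here is a sketch of the kind of request an app server could forward to Elasticsearch's _search endpoint. It is written in Java to match the project name rather than in Flask, and it only builds the request without sending it. The helper class, the host and port, the podcasts index, and the title field are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper: builds a full-text search request for Elasticsearch. */
final class PodcastSearchRequest {

    /** Minimal escaping of backslashes and double quotes; a real app would use a JSON library. */
    static String escapeJson(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** JSON body of a match query on a single field. */
    static String matchQuery(String field, String text) {
        return "{\"query\":{\"match\":{\"" + escapeJson(field) + "\":\"" + escapeJson(text) + "\"}}}";
    }

    /** POST request against baseUrl/index/_search carrying the match query. */
    static HttpRequest build(String baseUrl, String index, String field, String text) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(matchQuery(field, text)))
                .build();
    }

    public static void main(String[] args) {
        // Assumed local node, index, and field.
        HttpRequest request = build("http://localhost:9200", "podcasts", "title", "data engineering");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(matchQuery("title", "data engineering"));

        // Sending it would use java.net.http.HttpClient:
        //   HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

Keeping the app server between the browser and the search node means the front end never needs direct access to the cluster, and the server can validate or rewrite queries before forwarding them.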