DSC 232R: Big Data Analytics Using Spark

Jan 5, 2026

Course Overview

DSC 232R: Big Data Analytics Using Spark is a graduate-level course in the Master of Data Science program at UC San Diego’s Halicioglu Data Science Institute.

Topics Covered

Distributed computing fundamentals
Apache Spark architecture and programming model
Large-scale data processing with PySpark
Machine learning at scale with MLlib
Real-world applications in genomics and industry
Cloud computing and cluster management

Learning Outcomes

Students completing this course will be able to:

Design and implement distributed data processing pipelines
Apply machine learning algorithms to large-scale datasets
Optimize Spark applications for performance
Work with real-world big data from genomics and other domains

Offering Schedule

Spring 2024
Fall 2025
Winter 2026

Last updated on Jan 5, 2026

Tags: Data Science, Big Data, Spark, Machine Learning

Authors

Edwin Solares (he/him)

Lecturer in Computer Science & Data Science

I am a computational biologist and data scientist bridging artificial intelligence, evolutionary genomics, and climate-resilient agriculture. My research leverages cutting-edge machine learning and bioinformatics to address global food security challenges in the face of rapid climate change. With publications in high-impact journals including Nature Plants, PNAS, and Genome Research (h-index: 7), I develop tools and methods that advance both computational science and real-world applications.

