Data analysis has replaced data acquisition as the bottleneck to evidence-based decision making: we are drowning in data. Extracting knowledge from large, heterogeneous, and noisy datasets requires not only powerful computing resources but also the programming abstractions to use them effectively. The abstractions that emerged in the last decade blend ideas from parallel databases, distributed systems, and programming languages to create a new class of scalable data analytics platforms that form the foundation for data science at realistic scales.
The techniques and tools covered in Data Manipulation at Scale: Systems and Algorithms are most similar to the requirements found in Data Scientist job advertisements.