
NoSQL, Big Data, and Spark Fundamentals Specialization

Price: Free
Tried by: 3

About the NoSQL, Big Data, and Spark Fundamentals specialization

Big Data engineers and professionals with NoSQL skills are in high demand in the data management industry. This specialization is designed for those who want to develop fundamental skills in Big Data, Apache Spark, and NoSQL databases. Three information-rich courses cover popular NoSQL databases like MongoDB and Apache Cassandra, the widely used Apache Hadoop ecosystem of Big Data tools, and the Apache Spark analytics engine for processing large-scale data.

You will start with an overview of the different categories of NoSQL (not just SQL) data stores, and then work hands-on with several of them, including IBM Cloudant, MongoDB, and Cassandra. You will perform various data management tasks such as creating and replicating databases, and inserting, updating, deleting, querying, indexing, aggregating, and partitioning data. Next, you will gain fundamental knowledge of Big Data technologies such as Hadoop, MapReduce, HDFS, Hive, and HBase, followed by deeper knowledge of Apache Spark, Spark DataFrames, Spark SQL, PySpark, the Spark Application UI, and scaling Spark with Kubernetes. In the final course, you will learn how to work with Spark Structured Streaming and Spark ML to perform Extract, Transform, and Load (ETL) processing and machine learning tasks.

This specialization is suitable for aspiring NoSQL and Big Data professionals - whether you are or are training to become a data engineer, software developer, IT architect, data scientist, or IT manager.

Applied learning project

The focus of this specialization is on hands-on learning, so each course includes practical exercises where you apply the NoSQL and Big Data skills covered in the lectures.

In the first course, you will work with several NoSQL databases, including MongoDB, Apache Cassandra, and IBM Cloudant, performing tasks such as creating a database, adding documents, querying data, using the HTTP API, performing Create, Read, Update, and Delete (CRUD) operations, limiting and sorting records, indexing, aggregation, replication, using the CQL shell, and carrying out keyspace and other table operations.
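As a rough illustration, here is a minimal sketch of the kind of CRUD work practiced against MongoDB, assuming a local server and the Python pymongo driver; the database and collection names are invented for the example and are not part of the course material.

    # Minimal MongoDB CRUD sketch (assumes a MongoDB server on localhost)
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")  # assumed local server
    db = client["training"]                            # hypothetical database
    products = db["products"]                          # hypothetical collection

    # Create: insert a document
    products.insert_one({"name": "laptop", "price": 1200})

    # Read: query with a filter, sort descending by price, limit the results
    for doc in products.find({"price": {"$gt": 500}}).sort("price", -1).limit(5):
        print(doc)

    # Update: change a field on a matching document
    products.update_one({"name": "laptop"}, {"$set": {"price": 1100}})

    # Delete: remove the document
    products.delete_one({"name": "laptop"})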

In the next course, you will spin up a Hadoop cluster using Docker and run MapReduce jobs. You will then learn Spark using Jupyter notebooks with the Python kernel, develop your Spark skills using DataFrames and Spark SQL, and scale your jobs using Kubernetes.
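The DataFrame and Spark SQL work in that course looks roughly like the following minimal PySpark sketch, assuming a local Spark installation; the sample data and view name are illustrative only.

    # Minimal PySpark DataFrame / Spark SQL sketch (assumes pyspark is installed)
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("spark-fundamentals").getOrCreate()

    # Build a small DataFrame from in-memory data (illustrative columns)
    df = spark.createDataFrame(
        [("Alice", 34), ("Bob", 45), ("Carol", 29)],
        ["name", "age"],
    )

    # DataFrame API: filter rows
    df.filter(df.age > 30).show()

    # Equivalent query through Spark SQL via a temporary view
    df.createOrReplaceTempView("people")
    spark.sql("SELECT name, age FROM people WHERE age > 30").show()

    spark.stop()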

In the final course, you will use Spark for ETL processing and to train and deploy Machine Learning models with IBM Watson.
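A minimal sketch of what such an ETL-plus-training step can look like with Spark ML is shown below; the CSV path, column names, and the choice of a logistic regression model are assumptions for illustration, and deployment to IBM Watson is not shown.

    # Minimal Spark ETL + ML training sketch (assumed input file and columns)
    from pyspark.sql import SparkSession
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.classification import LogisticRegression

    spark = SparkSession.builder.appName("etl-ml-sketch").getOrCreate()

    # Extract: load raw data from a hypothetical CSV file
    raw = spark.read.csv("data/customers.csv", header=True, inferSchema=True)

    # Transform: drop incomplete rows and assemble numeric features
    clean = raw.dropna()
    assembler = VectorAssembler(inputCols=["age", "income"], outputCol="features")

    # Train: fit a simple classification pipeline ("churn" assumed to be 0/1)
    lr = LogisticRegression(featuresCol="features", labelCol="churn")
    model = Pipeline(stages=[assembler, lr]).fit(clean)

    model.transform(clean).select("churn", "prediction").show(5)
    spark.stop()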

Company: IBM