Big Data Consultant
Remobi
Embu das Artes, SP
Job Description
Job Title: Big Data Engineer

Location: Remote

Employment Type: [Full-Time/Contract]

Department: Data Engineering / Analytics

About the Role:

We are looking for a highly skilled and experienced Big Data Engineer to join our growing data team. In this role, you will design, develop, and optimize scalable data pipelines and architectures that enable data-driven decision-making across the organization. You'll work closely with data scientists, analysts, and software engineers to ensure reliable, efficient, and secure data infrastructure.

Key Responsibilities:
  • Design, develop, and maintain robust and scalable data pipelines for batch and real-time processing (see the illustrative sketch after this list).
  • Build and optimize data architectures to support advanced analytics and machine learning workloads.
  • Ingest data from various structured and unstructured sources using tools like Apache Kafka, Apache NiFi, or custom connectors.
  • Develop ETL/ELT processes using tools such as Spark, Hive, Flink, Airflow, or dbt.
  • Work with big data technologies such as Hadoop, Spark, HDFS, Hive, Presto, etc.
  • Implement data quality checks, validation processes, and monitoring systems.
  • Collaborate with data scientists and analysts to ensure data is accessible, accurate, and clean.
  • Manage and optimize data storage solutions including cloud-based data lakes (AWS S3, Azure Data Lake, Google Cloud Storage).
  • Implement and ensure compliance with data governance, privacy, and security best practices.
  • Evaluate and integrate new data tools and technologies to enhance platform capabilities.
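
For illustration only, a minimal sketch of the kind of batch pipeline described in the first item above might look like the following PySpark job. The bucket paths, column names, and application name are hypothetical, not part of this role's actual stack:

    from pyspark.sql import SparkSession, functions as F

    # Hypothetical example: paths, bucket names, and column names are illustrative.
    spark = SparkSession.builder.appName("daily-orders-etl").getOrCreate()

    # Ingest raw JSON events from a cloud data lake (e.g., AWS S3).
    raw = spark.read.json("s3a://example-data-lake/raw/orders/")

    # Basic data quality checks: require a primary key, derive a date, deduplicate.
    clean = (
        raw.filter(F.col("order_id").isNotNull())
           .withColumn("order_date", F.to_date("created_at"))
           .dropDuplicates(["order_id"])
    )

    # Write curated data back to the lake, partitioned for downstream analytics.
    clean.write.mode("overwrite").partitionBy("order_date").parquet(
        "s3a://example-data-lake/curated/orders/"
    )

    spark.stop()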


Required Skills and Qualifications:
  • Bachelor's or Master's degree in Computer Science, Engineering, Information Systems, or a related field.
  • 3+ years of experience in data engineering or software engineering roles with a focus on big data.
  • Strong programming skills in Python, Scala, or Java.
  • Proficiency with big data processing frameworks such as Apache Spark, Hadoop, or Flink.
  • Experience with SQL and NoSQL databases (e.g., PostgreSQL, Cassandra, MongoDB, HBase).
  • Hands-on experience with data pipeline orchestration tools like Apache Airflow, Luigi, or similar (a minimal DAG sketch follows this list).
  • Familiarity with cloud data services (AWS, GCP, or Azure), particularly services such as EMR, Glue, Databricks, or BigQuery.
  • Solid understanding of data modeling, data warehousing, and performance optimization.
  • Experience with CI/CD for data pipelines and infrastructure-as-code tools like Terraform or CloudFormation is a plus.
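
As a rough illustration of the orchestration experience mentioned above, a minimal Apache Airflow DAG might look like the following. The DAG id, schedule, and task commands are hypothetical, and the example assumes Airflow 2.x:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    # Hypothetical example: DAG id, schedule, and task commands are illustrative.
    with DAG(
        dag_id="daily_orders_pipeline",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",  # Airflow 2.4+; older releases use schedule_interval
        catchup=False,
    ) as dag:
        extract = BashOperator(
            task_id="extract",
            bash_command="echo 'pull raw data from source systems'",
        )
        transform = BashOperator(
            task_id="transform",
            bash_command="echo 'spark-submit the transformation job'",
        )
        validate = BashOperator(
            task_id="validate",
            bash_command="echo 'run data quality checks'",
        )

        # Run the three stages in sequence.
        extract >> transform >> validate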


Preferred Qualifications:
  • Experience working in agile development environments.
  • Familiarity with containerization tools like Docker and orchestration platforms like Kubernetes.
  • Knowledge of data privacy and regulatory compliance standards (e.g., GDPR, HIPAA).
  • Experience with real-time data processing and streaming technologies (e.g., Kafka Streams, Spark Streaming).
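
For a sense of the streaming work in the last item, a minimal Spark Structured Streaming sketch that reads from Kafka might look like this. The broker address and topic name are hypothetical, and the job would also need the spark-sql-kafka connector package on its classpath:

    from pyspark.sql import SparkSession, functions as F

    # Hypothetical example: broker address and topic name are illustrative.
    spark = SparkSession.builder.appName("orders-stream").getOrCreate()

    # Subscribe to a Kafka topic as an unbounded streaming DataFrame.
    events = (
        spark.readStream.format("kafka")
             .option("kafka.bootstrap.servers", "broker1:9092")
             .option("subscribe", "orders")
             .load()
    )

    # Kafka delivers key/value as binary; decode the value payload to a string.
    decoded = events.select(F.col("value").cast("string").alias("payload"))

    # For demonstration, write each micro-batch to the console.
    query = (
        decoded.writeStream.format("console")
               .outputMode("append")
               .start()
    )
    query.awaitTermination()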


Why Join Us:
  • Work with a modern data stack and cutting-edge technologies.
  • Be part of a data-driven culture in a fast-paced, innovative environment.
  • Collaborate with talented professionals from diverse backgrounds.