Data Engineer

Bombora

Bombora provides a global B2B intent platform powered by the world’s largest publisher data co-op. Our data allows sales teams to base their actions on the knowledge of which companies are in-market for their products, and empowers marketing teams to practice #SustainableMarketing. We process billions of content interactions daily to detect intent signals from companies around the world. To do this, we use modern distributed processing and machine learning technologies, including Spark, Dataflow/Beam, Airflow, PyTorch, and BigQuery.
This is a full-time position with competitive benefits and compensation, preferably based in our Reno, NV office, though we are open to remote candidates who are willing to travel to Reno occasionally.
What you will do
You will join our Data Engineering team, working alongside our data scientists and ML engineers to support Bombora R&D’s mission to design, develop, and maintain our world-class B2B DaaS products, leveraging machine intelligence and web-content consumption data at scale. You will do this by:
  • Creating and refining bounded (batch) and unbounded (streaming) ETL and ML data pipelines that comprise our production systems
  • Advancing development and integration of our major analytics and ML codebases using modern and rigorous software engineering principles
  • Helping to support and maintain our live production ETL and ML pipelines and systems
  • Mentoring and advancing the development of your colleagues
  • Having fun in an environment of collaboration, curiosity, and experimentation
Specific Responsibilities:
  • Develop applications, libraries and workflows with Python, Java, Apache Spark, Apache Beam, and Apache Airflow
  • Design and implement systems that run at scale on Google’s Dataproc, Dataflow, Kubernetes Engine, Pub/Sub, and BigQuery platforms.
  • Learn, design, and implement algorithms and machine learning operations at scale, using SciPy, PySpark, Spark Streaming, and MLbase libraries.
  • Learn and advance existing data models for our events, profiles and other datasets
  • Employ test-driven development, performance benchmarking, rapid release schedules, and continuous integration.
  • Participate in daily stand-ups, story planning, reviews, retrospectives, and the occasional outing to enjoy nearby cuisine and/or culture.
About you:
- Your background:
  • Education: B.S. / M.S. in computer science, physics, electrical engineering, applied mathematics, or equivalent experience.
  • Work experience: 3+ years of real-world development experience and 2+ years of experience with cloud and/or big-data platforms; GCP experience preferred.
  • Language fluency: Java / Python (at least 2 years of experience on commercial projects), and perhaps a few other languages.
  • Data wrangler: Driven by data, with the ability to leverage it to understand systems.
  • Impactful and effective: You live and breathe TDD and agile software methodologies, and apply them to great effect.
  • Character: Creativity, pragmatism, curiosity, and a good sense of humor
- Working knowledge of:
  • Algorithms / Data Structures: Design patterns, efficiency, using the right abstraction for the task.
  • Functional Programming: Filters and maps, currying and partial evaluation, group-by and reduce-by
  • OOP: Object paradigms to build components, when needed.
  • Databases: Familiarity with both relational (MySQL, PostgreSQL) and NoSQL (HBase, Cassandra, etc.) systems.
  • Data Processing at scale: Distributed computations, the map-reduce paradigm, and stream processing; Spark experience is helpful.
  • Build and release toolchains: Experience deploying projects in both Python (conda, setuptools, etc.) and Java (Maven).
  • Git: Comfortable working with Git (resolving conflicts, merging, branching, etc.).