Data Architect
Company: Bitewell
Location: San Francisco
Posted on: April 23, 2025
Job Description:
Applicants must have permanent work authorization in the United States. Bitewell is unable to provide visa sponsorship or support visa transfers for this position.

Are you a data architecture visionary ready to build the systems that will revolutionize how millions understand food health?

Bitewell is seeking a highly technical and visionary Data Architect to lead the design, implementation, and governance of our core data infrastructure. As the business evolves into one whose primary asset is a deep, extensible, and authoritative data corpus, this role is central to engineering the systems, flows, and validation regimes that will power our data ecosystem.

The ideal candidate will possess a deep command of data architecture fundamentals - including data modeling, pipeline construction, lineage tracking, observability, and schema evolution - and will bring an engineer's mindset to problems of scale, governance, and automation. You will define how data flows and how it is structured, validated, and made trustworthy across every component of the enterprise.

About Bitewell

Our vision is to create a world without diet-related disease. We're making this happen through our core product: the FoodHealth Score. The score is our proprietary, first-of-its-kind nutrition scoring system that combines cutting-edge science with innovative technology to give consumers insight and transparency into the healthfulness of food. We bring this to market with our partners in food retail, manufacturing, and data, empowering customers to make healthier choices, merchandisers to stock healthier shelves, and brands to make healthier products.

What You'll Do
- Data Flow & Infrastructure Architecture
- Design and document source-to-consumption data flows, including
data ingress from external APIs, internal services, user-facing
applications, and machine learning pipelines.
- Architect modular, scalable data pipelines for ingestion,
transformation, and loading, enabling traceability and reusability
at every stage.
- Define and implement data provenance protocols, capturing
metadata and lineage across all processing layers.
- Engineer streaming and batch processing systems using tools such as Apache Kafka, Apache Beam, or Airflow/dbt, ensuring event ordering and idempotency (a minimal sketch follows this group).
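
By way of illustration, the sketch below shows one way to satisfy the event-ordering and idempotency requirement above: an at-least-once Kafka consumer that deduplicates on a stable event ID before committing offsets. It assumes the kafka-python client; the topic name, broker address, event shape, and dedupe store are hypothetical, not Bitewell's actual stack.

    # Minimal idempotent-consumer sketch (illustrative assumptions only).
    import json
    from kafka import KafkaConsumer

    def handle(event: dict) -> None:
        # Placeholder for domain-specific processing (transform, load, etc.).
        print("processed", event["event_id"])

    consumer = KafkaConsumer(
        "food-events",                       # hypothetical topic
        bootstrap_servers="localhost:9092",  # hypothetical broker
        group_id="score-pipeline",
        enable_auto_commit=False,            # commit only after success
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )

    processed_ids = set()  # stand-in for a durable dedupe store (e.g., a DB table)

    for msg in consumer:
        event = msg.value
        event_id = event["event_id"]  # stable key makes replays safe
        if event_id in processed_ids:
            continue                  # duplicate delivery: skip without side effects
        handle(event)
        processed_ids.add(event_id)
        consumer.commit()             # advance the offset only after processing

In production the dedupe store would need to be durable and shared across consumer instances; the in-memory set here only illustrates the contract.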
- Schema & Corpus Design
- Model normalized, distributed data schemas that preserve data
integrity and avoid redundancy across domains.
- Lead the definition and maintenance of a canonical data model
and schema registry to unify internal data representations.
- Develop strategies for schema evolution that support backward/forward compatibility across dependent systems (a compatibility-check sketch follows this group).
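
As a hedged illustration of what "safe evolution" can mean in practice, the hand-rolled check below (plain dicts, not any particular schema registry's format) accepts a new schema version only if every existing field survives with its type and every added field carries a default - an approximation of full backward/forward compatibility.

    # Illustrative compatibility gate; schema shapes are assumptions.
    V1 = {"fields": {"sku": "string", "score": "int"}, "defaults": {}}
    V2 = {"fields": {"sku": "string", "score": "int", "source": "string"},
          "defaults": {"source": "unknown"}}

    def is_fully_compatible(old: dict, new: dict) -> bool:
        # Old fields must survive unchanged (protects readers on the old schema),
        # and new fields must have defaults (protects readers of old data).
        for name, typ in old["fields"].items():
            if new["fields"].get(name) != typ:
                return False
        added = set(new["fields"]) - set(old["fields"])
        return all(name in new["defaults"] for name in added)

    assert is_fully_compatible(V1, V2)       # adding optional 'source' is safe
    assert not is_fully_compatible(V2, V1)   # dropping 'source' would not be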
- Validation & Observability
- Define and implement data validation frameworks - including schema conformance, missingness detection, statistical anomaly detection, and referential integrity enforcement (a worked sketch follows this group).
- Design and deploy data observability platforms with SLAs around
freshness, completeness, and accuracy.
- Build automated audit and reconciliation processes to identify
and alert on pipeline regressions or integrity violations.
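
To make the four check families concrete, here is a small pandas sketch of a single validation pass; the table, column names, and thresholds are invented for illustration and are not real Bitewell schemas.

    # Illustrative validation pass over a hypothetical products table.
    import pandas as pd

    products = pd.DataFrame({
        "sku": ["A1", "A2", "A3"],
        "score": [72, None, 310],
        "brand_id": [1, 2, 99],
    })
    brands = pd.DataFrame({"brand_id": [1, 2, 3]})
    errors = []

    # 1. Schema conformance: required columns with expected dtypes.
    expected = {"sku": "object", "score": "float64", "brand_id": "int64"}
    for col, dtype in expected.items():
        if col not in products.columns or str(products[col].dtype) != dtype:
            errors.append(f"schema: {col} missing or not {dtype}")

    # 2. Missingness detection: null rate above a tolerated threshold.
    null_rate = products["score"].isna().mean()
    if null_rate > 0.10:
        errors.append(f"missingness: score null rate {null_rate:.0%} > 10%")

    # 3. Statistical anomaly detection: values outside a plausible range.
    if not products["score"].dropna().between(0, 100).all():
        errors.append("anomaly: score outside [0, 100]")

    # 4. Referential integrity: every brand_id must exist in brands.
    orphans = ~products["brand_id"].isin(brands["brand_id"])
    if orphans.any():
        errors.append(f"integrity: {orphans.sum()} orphan brand_id row(s)")

    print(errors or "all checks passed")

Frameworks named elsewhere in this posting (Great Expectations, Soda, Deequ) package checks like these with scheduling, reporting, and alerting on top.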
- Lineage, Governance & Metadata Management
- Implement end-to-end data lineage tracking (both static and dynamic) across ingestion, transformations, and delivery pipelines (illustrated after this group).
- Collaborate with Security and Compliance teams to enforce data
governance policies, including retention, access control, and PII
masking where applicable.
- Establish a metadata layer (technical + business) that
underpins data discoverability and semantic context for
consumers.
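
The sketch below is a plain-Python stand-in for what lineage capture records: each transformation step logs its inputs, outputs, and code version so any downstream dataset can be traced to its sources. The dataset names and decorator are illustrative; a real deployment would emit such events to a tool like OpenLineage or DataHub rather than an in-process list.

    # Illustrative lineage capture (names and shapes are assumptions).
    from dataclasses import dataclass, field
    from datetime import datetime, timezone

    @dataclass
    class LineageEvent:
        job: str
        inputs: list
        outputs: list
        code_version: str
        at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

    LINEAGE_LOG = []  # stand-in for a lineage backend

    def traced(job, inputs, outputs, code_version):
        """Decorator that records a lineage event each time a step runs."""
        def wrap(fn):
            def inner(*args, **kwargs):
                result = fn(*args, **kwargs)
                LINEAGE_LOG.append(LineageEvent(job, inputs, outputs, code_version))
                return result
            return inner
        return wrap

    @traced("normalize_scores", inputs=["raw.products"],
            outputs=["clean.products"], code_version="v1.4.2")
    def normalize_scores():
        pass  # placeholder transformation

    normalize_scores()
    print(LINEAGE_LOG[0])  # job, inputs, outputs, version, timestamp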
- Team Enablement & Documentation
- Champion internal data standards and best practices, mentoring
engineers on principled data design and pipeline development.
- Write technical specifications, data contracts, lineage
diagrams, and architecture documentation consumable by both
engineering and business stakeholders.
- Serve as a cross-functional bridge between data engineering, software, product, analytics, and machine learning teams.

Who You Are
- 7+ years of experience in Data Architecture, Data Engineering, or a similar backend-focused discipline.
- Mastery of SQL, data modeling (3NF, star/snowflake, dimensional
modeling), and ETL/ELT frameworks.
- Proven experience with distributed data systems (PostgreSQL,
BigQuery, Snowflake, Redshift, Delta Lake, etc.).
- Deep knowledge of data pipeline orchestration tools (Airflow,
Dagster, Prefect) and transformation layers (dbt, Spark,
Flink).
- Experience building data quality systems (e.g., Great
Expectations, Soda, Deequ) and schema validation engines.
- Fluency in metadata and lineage tools (OpenLineage, DataHub,
Amundsen, Marquez).
- Expertise in observability tooling (e.g., Monte Carlo, Datadog,
Grafana, OpenTelemetry for data).
- Comfort with containerized environments (Docker, Kubernetes)
and infrastructure-as-code (Terraform, Pulumi).
- Familiarity with event-driven architecture and streaming data
frameworks (Kafka, Pulsar, Kinesis).
- Strong documentation and systems design skills with a bias for clarity, versioning, and reproducibility.

Added Bonus
Preferred Qualifications
- Experience building data platforms in health, food, e-commerce,
or scientific domains.
- Familiarity with semantic layer modeling, data catalogs, and
data mesh principles.
- Experience supporting AI/ML model pipelines, including feature
stores and training set lineage.
- Background in managing large-scale historical data backfills,
data remediation workflows, or schema refactoring in
production.
- Understanding of regulatory compliance frameworks (HIPAA, GDPR, CCPA) as they relate to data governance.

What Success Looks Like
- All business-critical data pipelines are documented,
version-controlled, observable, and traceable.
- Schema changes are safe, validated, and governed across
environments and teams.
- Data flows and structures enable confidence scoring, downstream
analytics, and ML inference without loss of provenance or
integrity.
- The business's data corpus becomes a first-class system of record - authoritative, explainable, and extensible.

Why Bitewell?

We're not just building another app - we're creating technology that could fundamentally change how people eat and live.

Our culture values boldness, intellectual curiosity, and a hands-on approach. We're looking for someone who can bring their authentic self to work, engage in thought-provoking discussions, and contribute to our mission with both expertise and enthusiasm.

Ready to help us make "FoodHealth Score" a household term? Apply now to join our team and make an impact on one of the most important health challenges of our time.

Additional Details

Location: San Francisco
Compensation: $150K - $160K, with flexibility based on your experience and qualifications. The total compensation package includes equity, a bonus, and a full benefits package.

Bitewell celebrates diversity and is committed to building an inclusive environment for all. We're an equal opportunity employer and don't discriminate on the basis of race, color, ethnicity, religion, sex, gender identity, sexual orientation, age, disability status, or any other protected characteristic. We encourage candidates from all communities to apply and bring their authentic selves to our team as we work together to revolutionize how people make food choices.