Web crawling expert CV

Professional Summary:

Senior Back-end Engineer with over 10 years of experience specializing in Rust-based data engineering, high-volume web scraping, and AI agent harness development. Track record of building scalable systems that process millions of pages daily, from distributed crawlers to single-view data platforms.

Core Competencies:

  • Languages: Rust (main), Python (secondary), JavaScript (integration & tooling)
  • Expertise: large-scale web scraping, distributed crawlers, single-view data systems, REST/gRPC APIs, AI agent harnesses built in Rust (low memory footprint, reliability through the ownership model)
  • Cloud & Infra: Kubernetes, Docker, AWS (ECS, S3, Lambda), GCP, Terraform, GitHub Actions, Prometheus/Grafana
  • Data Stores: PostgreSQL, Redis, MongoDB, SurrealDB, Qdrant, Redb/Fjall
  • Areas of Interest: Fully Homomorphic Encryption (FHE), Abstract Algebra, Type Theory, LLM reasoning, agentic automation

Project Specialization:

  • High-Scale Web Crawling:

    Designed and implemented distributed web crawlers handling millions of pages daily, significantly reducing processing time and infrastructure costs.

  • Machine Learning Data Preparation:

    Collected, processed, and delivered high-quality datasets for training specialized Large Language Models (LLMs) and other AI systems, supporting accurate and robust model performance.

  • Price Comparison and Data Aggregation Systems:

    Developed automated pipelines that extract and transform data from numerous sources, keeping the aggregated data accurate and up to date.

  • Consulting and Mentoring:

    Provided strategic technical consulting and mentoring, empowering teams to implement best practices, optimize existing solutions, and transition effectively to Rust-based architectures.

Education:

ITMO University — Information Technology, Optical Design and Engineering (Russia)

Open Source Projects:

  • capp-rs: Modular framework for async web crawlers and data pipelines
  • dom-content-extraction: Rust implementation of the Content Extraction via Text Density algorithm
  • probabilistic-rs: Probabilistic data structures with persistence
  • exoplanets-catalog: Interactive catalog for the NASA Exoplanet Archive, built with Leptos, Axum, and Polars
  • pageinfo-rs: Web page analysis tool for AI agents — extracts structure, metadata, and URL patterns
  • tarts: Terminal screen savers and visual effects in Rust

GitHub: github.com/oiwn

Rates:

  • Consulting and Mentoring: $200/hr
  • Development: $100/hr