2025-08-09

Web crawling expert CV

Professional Summary:

Senior Back-end Engineer with over 10 years of experience specializing in Rust-based data engineering, high-volume web scraping, and AI agent harness development. Track record of building scalable systems that process millions of pages daily, from distributed crawlers to single-view data platforms.

Core Competencies:

Languages: Rust (main), Python (secondary), JavaScript (integration & tooling)
Expertise: large-scale web scraping, distributed crawlers, single-view data systems, AI agent harnesses in Rust (low memory footprint, reliability through the ownership model), REST/gRPC APIs
Cloud & Infra: Kubernetes, Docker, AWS (ECS, S3, Lambda), GCP, Terraform, GitHub Actions, Prometheus/Grafana
Data Stores: PostgreSQL, Redis, MongoDB, SurrealDB, Qdrant, Redb/Fjall
Areas of Interest: Fully Homomorphic Encryption (FHE), Abstract Algebra, Type Theory, LLM reasoning, agentic automation

Project Specialization:

High-Scale Web Crawling:

Designed and implemented distributed web crawlers handling millions of pages daily, significantly reducing processing time and infrastructure costs.

Machine Learning Data Preparation:

Expertise in collecting, processing, and delivering high-quality datasets for training specialized Large Language Models (LLMs) and AI systems, ensuring accurate and robust performance.

Price Comparison and Data Aggregation Systems:

Developed automated data extraction and transformation pipelines from numerous sources, continuously maintaining accurate and up-to-date data.

Consulting and Mentoring:

Provided strategic technical consulting and mentoring, empowering teams to implement best practices, optimize existing solutions, and transition effectively to Rust-based architectures.

Education:

ITMO University — Information Technology, Optical Design and Engineering (Russia)

Open Source Projects:

capp-rs: Modular framework for async web crawlers and data pipelines
dom-content-extraction: Rust implementation of the Content Extraction via Text Density algorithm
probabilistic-rs: Probabilistic data structures with persistence
exoplanets-catalog: Interactive catalog for NASA Exoplanet Archive, built with Leptos, Axum, and Polars
pageinfo-rs: Web page analysis tool for AI agents — extracts structure, metadata, and URL patterns
tarts: Terminal screen savers and visual effects in Rust

GitHub: github.com/oiwn

Rates:

Consulting and Mentoring: $200/hr
Development: $100/hr