refactor: split GitLab parser into focused modules #15
Conversation
Deploying with

| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ✅ Deployment successful! View logs | opal | a11b181 | Apr 07 2026, 07:34 PM |
Split the GitLab parser entry point into include resolution, normalization, job parsing, and dedicated tests without changing supported behavior.
```rust
let project_id = percent_encode(project);
let file_id = percent_encode(&relative.to_string_lossy());
let ref_id = percent_encode(reference);
let url =
```
@interstella5555 remove the curl command. add a proper way to execute http requests using a library
Current code is up to date here: the include fetch path now uses reqwest::Client in IncludeIo::fetch_project_file, so the shell curl path is gone.
```rust
stack.push(path.to_path_buf());

let content =
    fs::read_to_string(path).with_context(|| format!("failed to read {:?}", path))?;
```
@interstella5555 separate fs IO from this function such that it can be tested.
also, this does not use async
Current code is up to date here: filesystem and remote IO are now split behind IncludeIo, and the resolver uses async tokio::fs helpers instead of synchronous file access.
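The shape of that split can be sketched as a trait boundary with an in-memory fake for tests. This is a simplified synchronous sketch (the PR's `IncludeIo` is async); the `FakeIo`/`resolve_include` names here are illustrative, not the PR's exact API.

```rust
use std::collections::HashMap;

// IO boundary: the resolver only sees this trait, never the filesystem directly.
trait IncludeIo {
    fn read_file(&self, path: &str) -> Result<String, String>;
}

// Production impl would touch the real filesystem (tokio::fs in the PR).
struct FsIo;
impl IncludeIo for FsIo {
    fn read_file(&self, path: &str) -> Result<String, String> {
        std::fs::read_to_string(path).map_err(|e| format!("failed to read {path:?}: {e}"))
    }
}

// Test double: resolver logic can be exercised with no disk access at all.
struct FakeIo(HashMap<String, String>);
impl IncludeIo for FakeIo {
    fn read_file(&self, path: &str) -> Result<String, String> {
        self.0
            .get(path)
            .cloned()
            .ok_or_else(|| format!("missing {path}"))
    }
}

// Stand-in for the resolver entry point: it depends only on the trait.
fn resolve_include(io: &dyn IncludeIo, path: &str) -> Result<String, String> {
    io.read_file(path)
}
```

With this boundary in place, unit tests construct a `FakeIo` with canned YAML instead of writing fixture files.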
```diff
@@ -0,0 +1,430 @@
use crate::{GitLabRemoteConfig, env, runtime};
```
@interstella5555 this file does not use async, refactor it to use async. do not use spawn_block or spawn unless you have a good reason to.
Updated in 69737ad: include_resolver.rs is now async end-to-end, with async filesystem access through tokio::fs and remote fetching through reqwest::Client, without adding spawn or spawn_blocking.
Replace the include resolver curl subprocess with async reqwest, move resolver I/O behind a dedicated helper, and thread async pipeline loading through the parser and plan call sites.
```rust
}

async fn canonicalize(&self, path: &Path) -> Result<PathBuf> {
    tokio::fs::canonicalize(path)
```
Updated in 69737ad: I switched this module to top-level imports, including tokio::fs and the aliased serde_yaml imports, instead of repeating module-qualified calls.
crates/opal/src/executor/core.rs (Outdated)

```diff
@@ -88,7 +88,8 @@ impl ExecutorCore {
// TODO: this shit does way too much, hard to test if you add fs::create inside of it
```
@interstella5555 did you address the concerns in this TODO?
This function mixes a lot of concerns.
Updated in 5603857: I pulled the directory setup and executor env construction out of ExecutorCore::new into prepare_executor_directories[_at] and build_executor_env[_from], and added direct unit coverage for those extracted paths.
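The extraction described above can be sketched as follows. This is a minimal illustration of the testable-directory-setup shape only; the helper name mirrors `prepare_executor_directories_at` from the reply, but the directory names and return type are assumptions, not the PR's exact code.

```rust
use std::fs;
use std::path::{Path, PathBuf};

// Directory setup pulled out of the constructor: it takes an explicit root,
// so tests can point it at a temp dir instead of the real executor layout.
fn prepare_executor_directories_at(root: &Path) -> std::io::Result<Vec<PathBuf>> {
    // Illustrative subdirectory names, not the PR's actual layout.
    let dirs = ["builds", "cache", "artifacts"];
    let mut created = Vec::new();
    for dir in dirs {
        let path = root.join(dir);
        fs::create_dir_all(&path)?; // idempotent: succeeds if it already exists
        created.push(path);
    }
    Ok(created)
}
```

Because the filesystem work is behind its own function with an injectable root, `ExecutorCore::new` can stay a thin composition of the extracted pieces.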
```rust
for (key, value) in mapping {
    match key {
        Value::String(name) if name == "before_script" => {
```
this should be an enum, then the conversion should be much easier to do and more ergonomic. also, this should be easy to extend, because this represents the features of the gitlab pipeline
Updated in 69737ad: the parser dispatch here is now typed with enums (PipelineKeyword, DefaultKeyword, JobSectionKey) rather than raw string matching, so extending supported GitLab keys is more direct.
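The enum-backed dispatch can be sketched like this. The enum name follows `PipelineKeyword` from the reply, but the variant set shown here is a small illustrative subset of GitLab's top-level keys, not the PR's full list.

```rust
use std::str::FromStr;

// Typed keywords replace raw string matching: adding a supported GitLab key
// means adding a variant plus one match arm, and the compiler flags any
// match that forgets to handle it.
#[derive(Debug, PartialEq, Eq)]
enum PipelineKeyword {
    BeforeScript,
    AfterScript,
    Stages,
    Workflow,
}

impl FromStr for PipelineKeyword {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "before_script" => Ok(Self::BeforeScript),
            "after_script" => Ok(Self::AfterScript),
            "stages" => Ok(Self::Stages),
            "workflow" => Ok(Self::Workflow),
            other => Err(format!("unsupported pipeline key: {other}")),
        }
    }
}
```

Parser loops then match on `PipelineKeyword` variants instead of comparing `&str` values inline.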
```rust
use super::merge_mappings;

type ResolverFuture<'a, T> = Pin<Box<dyn Future<Output = Result<T>> + Send + 'a>>;
```
This extra indirection is gone in 69737ad; I simplified the code path and kept the direct form instead of carrying the additional helper/alias forward.
```rust
}

async fn try_exists(&self, path: &Path) -> Result<bool> {
    tokio::fs::try_exists(path)
```
Jesus, use a top-level tokio import for all of the cases below.
Updated in 69737ad: the repeated tokio::fs::... calls were collapsed to a single top-level use tokio::fs; import and fs::... call sites.
```rust
let response = self
    .client
    .get(&url)
    .header("PRIVATE-TOKEN", gitlab.token.as_str())
```
use context7, find out what the correct path is for this and how to set the token.
Updated in 69737ad: I verified the GitLab repository file raw endpoint and PRIVATE-TOKEN header via Context7, then moved that into gitlab_repository_file_raw_url(...) plus GITLAB_PRIVATE_TOKEN_HEADER so the contract is explicit in code.
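The URL construction can be sketched as below. The endpoint shape (`/api/v4/projects/:id/repository/files/:file_path/raw?ref=...`) and the `PRIVATE-TOKEN` header match GitLab's documented repository-files API; the helper names follow the reply, but the minimal percent-encoder here is illustrative (the PR presumably uses a proper encoding crate).

```rust
// Header name for GitLab personal/project access tokens.
const GITLAB_PRIVATE_TOKEN_HEADER: &str = "PRIVATE-TOKEN";

// Minimal percent-encoder: keeps unreserved characters, escapes the rest,
// so a project path like "group/app" becomes "group%2Fapp" as the API requires.
fn percent_encode(input: &str) -> String {
    input
        .bytes()
        .map(|b| match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                (b as char).to_string()
            }
            _ => format!("%{b:02X}"),
        })
        .collect()
}

// Builds the raw-file URL from the GitLab repository files API.
fn gitlab_repository_file_raw_url(base: &str, project: &str, file: &str, reference: &str) -> String {
    format!(
        "{base}/api/v4/projects/{}/repository/files/{}/raw?ref={}",
        percent_encode(project),
        percent_encode(file),
        percent_encode(reference)
    )
}
```

The request then sets `GITLAB_PRIVATE_TOKEN_HEADER` on the `reqwest` builder, keeping both halves of the API contract named in one place.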
```rust
    return Ok(None);
};
let rules: Vec<JobRule> =
    serde_yaml::from_value(rules_value.clone()).context("failed to parse workflow.rules")?;
```
Updated in 69737ad: this module now uses top-level aliased imports for the repeated serde_yaml calls (yaml_from_str, yaml_from_value).
```rust
fn parse_job(value: Value) -> Result<ParsedJobSpec> {
    match value {
        Value::Mapping(mut map) => {
            let image_value = map.remove(Value::String("image".to_string()));
```
Updated in 69737ad: this path now uses typed enums for the relevant parser states instead of raw string matching, so extending the supported GitLab keys stays localized.
```rust
    }
}

fn build_graph(
```
break this function down, separate the logic into smaller ones, it does too many things
Updated in 69737ad: build_graph is now just the coordinator; the work is split across GraphBuilder, build_job, and the smaller resolve_* helpers so the previous single large function is gone.
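The coordinator-plus-helpers shape can be sketched as below. The `GraphBuilder` and `resolve_*` names follow the reply's description, but the fields and the single `resolve_needs` concern shown here are illustrative stand-ins for the PR's actual helpers.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
struct JobNode {
    name: String,
    stage: String,
    needs: Vec<String>,
}

// Coordinator: holds the parsed inputs and delegates each concern
// to a small, separately testable helper.
struct GraphBuilder {
    jobs: BTreeMap<String, (String, Vec<String>)>, // name -> (stage, needs)
}

impl GraphBuilder {
    fn build(self) -> Result<Vec<JobNode>, String> {
        let names: Vec<String> = self.jobs.keys().cloned().collect();
        self.jobs
            .into_iter()
            .map(|(name, (stage, needs))| {
                let needs = Self::resolve_needs(&name, needs, &names)?;
                Ok(JobNode { name, stage, needs })
            })
            .collect()
    }

    // One focused helper: validate that every `needs` entry names a real job.
    fn resolve_needs(job: &str, needs: Vec<String>, known: &[String]) -> Result<Vec<String>, String> {
        for need in &needs {
            if !known.contains(need) {
                return Err(format!("job {job} needs unknown job {need}"));
            }
        }
        Ok(needs)
    }
}
```

Each `resolve_*` helper can then get direct unit tests without constructing a full pipeline.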
```rust
inherit.interruptible = *value;
}
RawInheritDefault::List(entries) => {
    inherit.image = entries.iter().any(|entry| entry == "image");
```
Updated in 69737ad: this branch was converted to enum-backed parsing as part of the parser cleanup, so it no longer relies on the previous stringly dispatch.
```rust
Ok(())
}

fn parse_artifact_when(value: Option<&str>, job_name: &str) -> Result<ArtifactWhen> {
```
make an enum out of states as well.
Updated in 69737ad: the state parsing here is now typed through ArtifactWhenKeyword before converting to ArtifactWhen.
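A sketch of the typed states: the three values (`on_success`, `on_failure`, `always`) and the `on_success` default are GitLab's documented `artifacts:when` behavior; for brevity this sketch folds the reply's `ArtifactWhenKeyword` intermediate into a single enum, so the error strings and exact shape are assumptions.

```rust
use std::str::FromStr;

// Typed `artifacts:when` states instead of passing raw strings around.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArtifactWhen {
    OnSuccess,
    OnFailure,
    Always,
}

impl FromStr for ArtifactWhen {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "on_success" => Ok(Self::OnSuccess),
            "on_failure" => Ok(Self::OnFailure),
            "always" => Ok(Self::Always),
            other => Err(format!("invalid artifacts:when value: {other}")),
        }
    }
}

fn parse_artifact_when(value: Option<&str>, job_name: &str) -> Result<ArtifactWhen, String> {
    match value {
        // GitLab's documented default when `when` is omitted.
        None => Ok(ArtifactWhen::OnSuccess),
        Some(s) => s.parse().map_err(|e| format!("job {job_name}: {e}")),
    }
}
```

Downstream code matches on the enum, so an unsupported state fails at parse time rather than leaking a string into the graph.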
```rust
use tracing::warn;

use super::super::{
    graph::{
```
Updated in a11b181: flattened this import so graph and rules are imported separately at the top level instead of through the nested grouped import.
Summary

- `include:project` fetching behind a dedicated `IncludeResolver`
- `!reference` normalization separate from graph and job parsing logic

Why

The old `crates/opal/src/gitlab/parser.rs` mixed file loading, include traversal, normalization, graph building, and tests in one monolithic module. This refactor reduces coupling without changing the supported GitLab behavior.

Closes CLO-139
Linear: https://linear.app/cloudflavor/issue/CLO-139/split-gitlab-parser-into-include-normalization-and-job-parsing-modules

Validation

- `cargo fmt --all`
- `cargo test -p opal gitlab::parser --lib`
- `cargo check -p opal`
- `cargo test -p opal --lib`
- `cargo clippy -p opal --all-targets -- -D warnings`
- `rust-checks` slice: Success (gitlab-ci-f7175817)