Skip to content

Merge all previous parameters in case of a batch job restart #38194

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from
Closed

Merge all previous parameters in case of a batch job restart #38194

wants to merge 5 commits into from

Conversation

cachescrubber
Copy link
Contributor

@cachescrubber cachescrubber commented Nov 3, 2023

Merge all previous parameters in case of a batch job restart

When looking to replace a custom job launcher with JobLauncherApplicationRunner I noticed that upon job restart, it will not copy non-identifying parameters from the previous (failed) execution to the retry execution.

This PR uses the identifying parameter from the command line to find the failed job execution. All other parameters of the failed invocation are merged with the command-line.

This seems to be in sync with org.springframework.batch.core.launch.support.SimpleJobOperator#restart. Restart works using the failed instance-id and will copy all params of the failed execution - but lacks the possibility to override non-identifying params.

Scenario: Batch Job Definition uses a JobParametersIncrementer implementation.

initial / regular invocation

# java -jar batch-app.jar

The incrementer will generate three job parameters:

export_number,java.lang.Long,2,Y
time.start,java.time.LocalDateTime,1970-01-01T00:00:00,N
time.end,java.time.LocalDateTime,2023-11-06T07:44:42.138356,N

If the job execution fails, in order to restart the job, one would have to pass all three parameter on the command line

current restart invocation

# java -jar batch-app.jar \
export_number,java.lang.Long,2,true \
time.start,java.time.LocalDateTime,1970-01-01T00:00:00,false \
time.end,java.time.LocalDateTime,2023-11-06T07:44:42.138356,false

My PR would allow to restart the job using only the identifying parameter, while the non-identifying params would be copied from the failed invocation.

simplified restart invocation

# java -jar batch-app.jar export_number,java.lang.Long,2,true

The current restart invocation would still work, cli parameters would have precedence over the old non-identifying prameters. Identifying parameters must be identical anyway.

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Nov 3, 2023
@wilkinsona
Copy link
Member

@fmbenhassine, could you please take a look at this when you have a minute?

@cachescrubber
Copy link
Contributor Author

I could not register the Jackson2ExecutionContextStringSerializer. The only possibility right now is to completely opt-out of the BatchAutoConfiguration. I changed SpringBootBatchConfiguration to accept an optional bean of type ExecutionContextSerializer, using DefaultExecutionContextSerializer as default.

Since spring-batch 5.0.x, a JobOperator bean is part of the DefaultBatchConfiguration.

@fmbenhassine
Copy link
Contributor

The reason for not copying non-identifying parameters on a restart is because there is no way to remove them (see https://github.com/spring-projects/spring-boot/pull/14933/files#diff-1752b9b4da584632b57c0f8c4096419e0207f4cd0f9a48a48629176e1e7c0c84R207-R210).

With this PR, and based on the given example, one would not be able to remove the time.start and time.end parameters on a restart (since they will be copied from the previous execution). So while this PR might simplify things, it actually removes a feature.

The typical example where one wants to remove a non-identifying parameter on a restart is when the parameter was passed by mistake (or with a wrong value and should be removed).


The other changes regarding the serializer and job operator LGTM.

@cachescrubber
Copy link
Contributor Author

@fmbenhassine Thanks for the review and clarification. I restored the previous behavior using lastExecution.getJobParameters().getIdentifyingParameters() in favor of getGetIdentifying().

I'm still unsure if the merge is necessary at all, since the identify params must be the same in order to find the failed execution in the first place.

@fmbenhassine
Copy link
Contributor

The merge is necessary to add non-identifying parameters, as they are not copied from the last execution (the identifying ones will be overridden with the same values anyway).

I think the documentation should be updated as well with a note about this use case (ie mention that on a restart, non-identifying parameters must be respecified if the are are still required or need to be modified, otherwise they will be removed). In other words, document what you have noticed : I noticed that upon job restart, it will not copy non-identifying parameters from the previous (failed) execution to the retry execution.

@cachescrubber
Copy link
Contributor Author

close in favor of #38234

@wilkinsona wilkinsona added status: superseded An issue that has been superseded by another and removed status: waiting-for-triage An issue we've not yet triaged labels Nov 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: superseded An issue that has been superseded by another
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants