Description
This will be a bit longer. I'll describe the behaviour first, then what I found by debugging:
I am running a private instance of gitea 1.22.0+rc1-126-gec771fdfcd and act_runner release 0.2.10
The runner is registered as repository runner.
When act_runner connects and finds a (matching) task waiting, it will take the task and run it correctly and without problems. It will then wait for more tasks.
However, if I then trigger another task (of the same workflow), it will not take that task and just sits idle forever.
If it connects and there is no task waiting, it keeps waiting. If a workflow is then triggered and a task is waiting, the runner will not receive and handle it.
So the runner config and operation itself is fine but it will only ever process one job.
This is what I found after some debugging:
The runner will continuously call /api/actions/runner.v1.RunnerService/FetchTask but not receive more than one task, so the problem seemed to be on the gitea side.
I looked into FetchTask() in ./routers/api/actions/runner/runner.go
After some debugging, I found that the comparison
https://github.com/go-gitea/gitea/blob/main/routers/api/actions/runner/runner.go#L155
is why no second task is delivered.
At the first (working) call, the runner sends tasksVersion as 0 and that is compared with latestVersion which is 1. So pickTask gets called and a task is sent back.
From the next call on, the runner sends 1, it gets compared to latestVersion 1 and pickTask is not called.
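To make the two calls above concrete, here is a minimal sketch of the guard as I understand it. It is not the actual Gitea code: shouldPickTask is a hypothetical helper, and I am assuming the comparison is a plain inequality between the version the runner sends and the server's latestVersion.

```go
package main

import "fmt"

// shouldPickTask is a hypothetical stand-in for the comparison in FetchTask:
// the server only looks for a task when the version sent by the runner
// differs from the version the server computed.
func shouldPickTask(runnerVersion, latestVersion int64) bool {
	return runnerVersion != latestVersion
}

func main() {
	const latestVersion = 1 // value observed on the server in both calls

	// First call: the runner sends 0, so a task is picked.
	fmt.Println(shouldPickTask(0, latestVersion))

	// Every later call: the runner sends 1, so pickTask is skipped.
	fmt.Println(shouldPickTask(1, latestVersion))
}
```

As long as latestVersion stays at 1 on the server, the second case repeats forever, which matches the idle runner I see.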
The comparison is part of
I dug deeper into GetTasksVersionByScope and IncreaseTaskVersion.
And IncreaseTaskVersion is also called each time a job changes state.
And IncreaseTaskVersion calls increaseTasksVersionByScope for 3 scopes: once with the ownerID set to 0, once with the repoID set to 0 and once with both set to 0.
Also, increaseTasksVersionByScope will implicitly insert a record if no existing one is found.
So for my runner with ownerID 3 and repoID 1, I get three records:
mysql> select * from action_tasks_version;
+----+----------+---------+---------+--------------+--------------+
| id | owner_id | repo_id | version | created_unix | updated_unix |
+----+----------+---------+---------+--------------+--------------+
| 1 | 0 | 0 | 15990 | 1716545692 | 1716545692 |
| 2 | 3 | 0 | 24066 | 1716545692 | 1716545692 |
| 3 | 0 | 1 | 15990 | 1716545692 | 1716545692 |
+----+----------+---------+---------+--------------+--------------+
3 rows in set (0.00 sec)
However, querying the version is done differently.
GetTasksVersionByScope just does:
has, err := db.GetEngine(ctx).Where("owner_id = ? AND repo_id = ?", ownerID, repoID).Get(&tasksVersion)
And for ownerID 3 and repoID 1, this never returns a matching value since the insert command always sets at least one value to 0.
The function returns 0, since no value was found (and IncreaseTaskVersion gets called, but that again won't create a matching entry).
The value of 0 gets increased to 1 by latestVersion++ and it gets compared to the 1 the runner sent. It matches, so pickTask is not called.
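The mismatch between writing and reading can be reproduced with a small in-memory model. This is only a sketch of my understanding, not Gitea's code: the scope and versionStore types and their methods are made up for illustration, and a map stands in for the action_tasks_version table.

```go
package main

import "fmt"

// scope is a hypothetical key mirroring the (owner_id, repo_id) columns.
type scope struct{ ownerID, repoID int64 }

// versionStore is a hypothetical in-memory stand-in for action_tasks_version.
type versionStore map[scope]int64

// increase models the three scoped writes: global, owner only, repo only.
// A missing entry is created implicitly, like the insert on the server.
func (s versionStore) increase(ownerID, repoID int64) {
	s[scope{0, 0}]++
	s[scope{ownerID, 0}]++
	s[scope{0, repoID}]++
}

// get models the exact-match lookup "owner_id = ? AND repo_id = ?".
func (s versionStore) get(ownerID, repoID int64) (int64, bool) {
	v, ok := s[scope{ownerID, repoID}]
	return v, ok
}

func main() {
	store := versionStore{}
	store.increase(3, 1)
	store.increase(3, 1)

	v, found := store.get(3, 1)
	fmt.Println(v, found)   // prints "0 false": no row has both IDs set
	fmt.Println(len(store)) // prints "3": only the three scoped rows exist
}
```

No matter how often increase is called, the lookup for (3, 1) never finds a row, so the version the server reports never changes.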
I may completely misunderstand how this is supposed to work (and I have never programmed in Go), but it seems this whole mechanism will not work for repository scope runners where both ownerID and repoID are non-zero.
A friend tried this out with an org-scope runner and there it works fine.
Since the whole mechanism is just an optimization, I would just disable the comparison, and then everything works again.
But I'm unsure how to fix this properly, so I did not include a PR.....
Using the maximum of all scopes could work for this case
SELECT MAX(version) FROM action_tasks_version WHERE (owner_id = 3) or (repo_id = 1) or (owner_id = 0 and repo_id = 0);
but I'm not sure whether this breaks other things, or how to express it in xorm.
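For illustration, the same selection logic as the SQL above, written as plain Go over the three rows from my table. This is only a sketch: the row type and maxVersionForScopes are hypothetical names, it does not use xorm, and I have not checked whether taking the maximum is the right fix.

```go
package main

import "fmt"

// row is a hypothetical copy of the relevant action_tasks_version columns.
type row struct{ ownerID, repoID, version int64 }

// maxVersionForScopes mirrors the WHERE clause of the query above:
// owner matches, or repo matches, or the row is the global (0, 0) row.
func maxVersionForScopes(rows []row, ownerID, repoID int64) int64 {
	var best int64
	for _, r := range rows {
		matches := r.ownerID == ownerID ||
			r.repoID == repoID ||
			(r.ownerID == 0 && r.repoID == 0)
		if matches && r.version > best {
			best = r.version
		}
	}
	return best
}

func main() {
	// The three rows from the table shown earlier.
	rows := []row{
		{ownerID: 0, repoID: 0, version: 15990},
		{ownerID: 3, repoID: 0, version: 24066},
		{ownerID: 0, repoID: 1, version: 15990},
	}
	fmt.Println(maxVersionForScopes(rows, 3, 1)) // prints 24066
}
```

With this, any increase in one of the three scopes changes the value seen by a repository runner, so the comparison with the runner's version would no longer get stuck.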
I didn't try to reproduce it on the demo site yet.
I set both gitea and act_runner to debug logging, but the runner is silent and gitea just produces lots of
2024-05-24T21:17:48.988208+00:00 localhost gitea[93710]: 2024/05/24 21:17:48 ...s/process/manager.go:188:Add() [T] Start 665103fc: POST: /api/actions/runner.v1.RunnerService/FetchTask (request)
2024-05-24T21:17:48.988286+00:00 localhost gitea[93710]: 2024/05/24 21:17:48 ...eb/routing/logger.go:47:func1() [T] router: started POST /api/actions/runner.v1.RunnerService/FetchTask for 4.180.7.35:37748
2024-05-24T21:17:49.020301+00:00 localhost gitea[93710]: 2024/05/24 21:17:49 ...eb/routing/logger.go:102:func1() [I] router: completed POST /api/actions/runner.v1.RunnerService/FetchTask for 4.180.7.35:37748, 200 OK in 32.2ms @ <autogenerated>:1(http.Handler.ServeHTTP-fm)
2024-05-24T21:17:49.020413+00:00 localhost gitea[93710]: 2024/05/24 21:17:49 ...s/process/manager.go:231:remove() [T] Done 665103fc: POST: /api/actions/runner.v1.RunnerService/FetchTask
calls.
I have a lot of logs where I manually and very crudely added debug output. I can provide these, if they help.
Gitea Version
1.22.0+rc1-126-gec771fdfcd
Can you reproduce the bug on the Gitea demo site?
No
Log Gist
No response
Screenshots
No response
Git Version
2.43.0
Operating System
Ubuntu 24.04
How are you running Gitea?
Bug shows up both with the rc1 binary as well as self-built from main.
I run gitea from systemd without docker.
act_runner is running directly on the system, started from the command line. It processes tasks in docker.
Database
MySQL/MariaDB