
Commit 22827ba

SDK v2 development (#26)
Object model updates:

* Task/Batch/Project models updated
* A new method as_dict() introduced to access object as a dict
* New ways to retrieve the list of tasks/batches: get_tasks and get_batches are the new generator methods for bulk retrieval

API:

* Isolated API access into a different class
* Enabled HTTP retry for certain error codes
* Improved error handling by differentiating exception types

Infra improvements:

* Enabled type hinting across the package
* New code standards applied via Pylint, flake8 and black
* Integrated pre-commit for a better/consistent developer experience
* publish.sh introduced for an automated publish to PyPI
* New pytest test cases are added

Documentation:

* New Migration guide for v2
* New Developer Guide (how to setup repo env and configure pre-commit)
* Updated deployment and publishing guide
* Updated README for v2
* Made README to be available in PyPI
1 parent 139e521 commit 22827ba

21 files changed, with 1753 additions and 584 deletions

.gitignore

Lines changed: 26 additions & 6 deletions
@@ -1,8 +1,28 @@
1-
*.pyc
1+
# Byte-compiled / optimized / DLL files
2+
__pycache__/
3+
*.py[cod]
4+
*$py.class
5+
6+
# Distribution / packaging
7+
/build/
28
/dist/
3-
/*.egg-info
4-
.tox
5-
.cache
6-
/.vscode/
9+
*.egg
10+
*.eggs
11+
*.egg-info/
12+
MANIFEST
13+
14+
# For Visual Studio Code
15+
.vscode/
16+
17+
# Mac
718
.DS_Store
8-
/build/
19+
20+
# Unit test / coverage reports
21+
.[nt]ox/
22+
htmlcov/
23+
.coverage
24+
.coverage.*
25+
.*cache
26+
nosetests.xml
27+
coverage.xml
28+
*.cover

.pre-commit-config.yaml

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
1+
default_language_version:
2+
python: python3.6
3+
default_stages: [commit]
4+
5+
repos:
6+
- repo: https://github.com/pre-commit/pre-commit-hooks
7+
rev: v3.2.0
8+
hooks:
9+
- id: trailing-whitespace
10+
- id: end-of-file-fixer
11+
- id: check-added-large-files
12+
- id: check-yaml
13+
- id: check-case-conflict
14+
- repo: https://github.com/pycqa/isort
15+
rev: 5.8.0
16+
hooks:
17+
- id: isort
18+
name: isort
19+
args: ["--profile", "black"]
20+
- repo: https://github.com/psf/black
21+
rev: 20.8b1
22+
hooks:
23+
- id: black
24+
- repo: https://gitlab.com/pycqa/flake8
25+
rev: 3.8.4
26+
hooks:
27+
- id: flake8
28+
- repo: local
29+
hooks:
30+
- id: pylint
31+
name: pylint
32+
entry: pylint
33+
language: python
34+
types: [python]
35+
files: scaleapi/
36+
additional_dependencies:
37+
- 'pylint>=2.7.4'
38+
- 'requests>=2.25.0'
39+
- 'urllib3>=1.26.0'
40+
- 'pytest>=6.2.2'
41+
language_version: python3.6

.pylintrc

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
[MASTER]
2+
disable=
3+
missing-module-docstring,
4+
too-few-public-methods,
5+
too-many-locals,
6+
too-many-arguments,
7+
too-many-instance-attributes,
8+
invalid-name,

MANIFEST

Lines changed: 0 additions & 5 deletions
This file was deleted.

README.rst

Lines changed: 121 additions & 60 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,22 @@
1-
=====================
1+
*********************
22
Scale AI | Python SDK
3-
=====================
3+
*********************
4+
5+
If you use earlier versions of the SDK, please refer to `v1.0.4 documentation <https://github.com/scaleapi/scaleapi-python-client/blob/release-1.0.4/README.rst>`_.
6+
7+
If you are migrating from earlier versions to v2, please refer to `Migration Guide to v2 <https://github.com/scaleapi/scaleapi-python-client/blob/master/docs/migration_guide.md>`_.
8+
9+
|pic1| |pic2| |pic3|
10+
11+
.. |pic1| image:: https://pepy.tech/badge/scaleapi/month
12+
:alt: Downloads
13+
:target: https://pepy.tech/project/scaleapi
14+
.. |pic2| image:: https://img.shields.io/pypi/pyversions/scaleapi.svg
15+
:alt: Supported Versions
16+
:target: https://pypi.org/project/scaleapi
17+
.. |pic3| image:: https://img.shields.io/github/contributors/scaleapi/scaleapi-python-client.svg
18+
:alt: Contributors
19+
:target: https://github.com/scaleapi/scaleapi-python-client/graphs/contributors
420

521
Installation
622
____________
@@ -9,8 +25,6 @@ ____________
925
1026
$ pip install --upgrade scaleapi
1127
12-
Note: We strongly suggest using `scaleapi` with Python version 2.7.9 or greater due to SSL issues with prior versions.
13-
1428
Usage
1529
_____
1630

@@ -23,11 +37,11 @@ Tasks
2337
_____
2438

2539
Most of these methods will return a `scaleapi.Task` object, which will contain information
26-
about the json response (task_id, status, etc.).
40+
about the json response (task_id, status, params, response, etc.).
2741

2842
Any parameter available in `Scale's API documentation`__ can be passed as an argument option with the corresponding type.
2943

30-
__ https://docs.scale.com/reference#task-object
44+
__ https://docs.scale.com/reference#tasks-object-overview
3145

3246
The following endpoints for tasks are available:
3347

@@ -38,15 +52,18 @@ This method can be used for any Scale supported task type using the following fo
3852

3953
.. code-block:: python
4054
41-
client.create_{{Task Type}}_task(...)
55+
client.create_task(TaskType, ...task parameters...)
4256
4357
Passing in the applicable values into the function definition. The applicable fields and further information for each task type can be found in `Scale's API documentation`__.
4458

45-
__ https://docs.scale.com/reference#general-image-annotation
59+
__ https://docs.scale.com/reference
4660

4761
.. code-block:: python
4862
49-
client.create_imageannotation_task(
63+
from scaleapi.tasks import TaskType
64+
65+
client.create_task(
66+
TaskType.ImageAnnotation,
5067
project = 'test_project',
5168
callback_url = "http://www.example.com/callback",
5269
instruction= "Draw a box around each baby cow and big cow.",
@@ -61,51 +78,65 @@ __ https://docs.scale.com/reference#general-image-annotation
6178
}
6279
)
6380
64-
Retrieve task
65-
^^^^^^^^^^^^^
81+
Retrieve a task
82+
^^^^^^^^^^^^^^^
6683

6784
Retrieve a task given its id. Check out `Scale's API documentation`__ for more information.
6885

6986
__ https://docs.scale.com/reference#retrieve-tasks
7087

7188
.. code-block :: python
7289
73-
task = client.fetch_task('asdfasdfasdfasdfasdfasdf')
74-
print(task.status) // Task status ('pending', 'completed', 'error', 'canceled')
75-
print(task.response) // If task is complete
90+
task = client.get_task('30553edd0b6a93f8f05f0fee')
91+
print(task.status) # Task status ('pending', 'completed', 'error', 'canceled')
92+
print(task.response) # If task is complete
7693
7794
List Tasks
7895
^^^^^^^^^^
7996

80-
Retrieve a list of tasks, with optional filter by start and end date/time. Paginated with `next_token`. The return value is a `scaleapi.Tasklist`, which acts as a list, but also has fields for the total number of tasks, the limit and offset, and whether or not there's more. Check out `Scale's API documentation`__ for more information.
97+
Retrieve a list of `Task` objects, with filters for: ``project_name``, ``batch_name``, ``type``, ``status``,
98+
``review_status``, ``unique_id``, ``completed_after``, ``completed_before``, ``updated_after``, ``updated_before``,
99+
``created_after``, ``created_before`` and ``tags``.
100+
101+
``get_tasks()`` is a **generator** method and yields ``Task`` objects.
102+
103+
`A generator is a special type of function that returns an iterable you can loop over like a list.
104+
However, unlike lists, generators do not store their contents in memory.
105+
That helps you to process a large number of objects without increasing memory usage.`
106+
107+
If you iterate through the tasks and process them only once, using a generator is the most efficient method.
108+
However, if you need to process the list of tasks multiple times, you can wrap the generator in a ``list(...)``
109+
statement, which returns a list of Tasks by loading them into memory.
110+
111+
Check out `Scale's API documentation`__ for more information.
81112

82113
__ https://docs.scale.com/reference#list-multiple-tasks
83114

84115
.. code-block :: python
85116
86-
next_token = None;
87-
counter = 0
88-
all_tasks =[]
89-
while True:
90-
tasks = client.tasks(
91-
start_time = "2020-09-08",
92-
end_time = "2021-01-01",
93-
customer_review_status = "accepted",
94-
next_token = next_token,
95-
)
96-
for task in tasks:
97-
counter += 1
98-
print('Downloading Task %s | %s' % (counter, task.task_id))
99-
all_tasks.append(task.__dict__['param_dict'])
100-
next_token = tasks.next_token
101-
if next_token is None:
102-
break
103-
print(all_tasks)
117+
from scaleapi.tasks import TaskReviewStatus, TaskStatus
118+
119+
tasks = client.get_tasks(
120+
project_name = "My Project",
121+
created_after = "2020-09-08",
122+
completed_before = "2021-04-01",
123+
status = TaskStatus.Completed,
124+
review_status = TaskReviewStatus.Accepted
125+
)
126+
127+
# Iterating through the generator
128+
for task in tasks:
129+
# Download task or do something!
130+
print(task.task_id)
131+
132+
# Alternatively, build a Task list (use instead of the loop above; a generator can be consumed only once)
133+
task_list = list(tasks)
134+
print(f"{len(task_list)} tasks retrieved")
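The single-pass behaviour of generators described above can be seen without calling the API. The sketch below uses a stand-in generator, ``fake_get_tasks``, which is hypothetical and only mimics the shape of ``get_tasks()``; it is not part of the SDK, and the dictionaries it yields are placeholders for ``Task`` objects.

```python
from typing import Dict, Iterator, List


def fake_get_tasks(count: int) -> Iterator[Dict[str, str]]:
    """Hypothetical stand-in for client.get_tasks(); yields one item at a time."""
    for index in range(count):
        yield {"task_id": f"task_{index}"}


# First pass: items are produced lazily, one at a time.
tasks = fake_get_tasks(3)
seen_ids: List[str] = [task["task_id"] for task in tasks]

# The generator is now exhausted, so a second pass yields nothing.
leftover = list(tasks)

# To process the results more than once, load a fresh generator into a list.
task_list = list(fake_get_tasks(3))

print(seen_ids)
print(len(leftover), len(task_list))
```

If the same results are needed in both a loop and a list, it is usually simplest to build the list first and loop over that.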
104135
105136
Cancel Task
106137
^^^^^^^^^^^
107138

108-
Cancel a task given its id if work has not started on the task (task status is `Queued` in the UI). Check out `Scale's API documentation`__ for more information.
139+
Cancel a task given its id if work has not started on the task (task status is ``Queued`` in the UI). Check out `Scale's API documentation`__ for more information.
109140

110141
__ https://docs.scale.com/reference#cancel-task
111142

@@ -153,8 +184,13 @@ __ https://docs.scale.com/reference#batch-status
153184
154185
client.batch_status(batch_name = 'batch_name_01_07_2021')
155186
156-
Retrieve Batch
157-
^^^^^^^^^^^^^^
187+
# Alternative via Batch.get_status()
188+
batch = client.get_batch('batch_name_01_07_2021')
189+
batch.get_status() # Refreshes tasks_{status} attributes of Batch
190+
print(batch.tasks_pending, batch.tasks_completed)
191+
192+
Retrieve A Batch
193+
^^^^^^^^^^^^^^^^
158194

159195
Retrieve a single Batch. Check out `Scale's API documentation`__ for more information.
160196

@@ -167,27 +203,37 @@ __ https://docs.scale.com/reference#batch-retrieval
167203
List Batches
168204
^^^^^^^^^^^^
169205

170-
Retrieve a list of Batches. Check out `Scale's API documentation`__ for more information.
206+
Retrieve a list of Batches. Optional parameters are ``project_name``, ``batch_status``, ``created_after`` and ``created_before``.
207+
208+
``get_batches()`` is a **generator** method and yields ``Batch`` objects.
209+
210+
`A generator is a special type of function that returns an iterable you can loop over like a list.
211+
However, unlike lists, generators do not store their contents in memory.
212+
That helps you to process a large number of objects without increasing memory usage.`
213+
214+
When wrapped in a ``list(...)`` statement, it returns a list of Batches by loading them into memory.
215+
216+
Check out `Scale's API documentation`__ for more information.
171217

172218
__ https://docs.scale.com/reference#batch-list
173219

174220
.. code-block :: python
175221
176-
next_token = None;
222+
from scaleapi.batches import BatchStatus
223+
224+
batches = client.get_batches(
225+
batch_status=BatchStatus.Completed,
226+
created_after = "2020-09-08"
227+
)
228+
177229
counter = 0
178-
all_batchs =[]
179-
while True:
180-
batches = client.list_batches(
181-
status = "completed"
182-
)
183-
for batch in batches:
184-
counter += 1
185-
print('Downloading Batch %s | %s | %s' % (counter, batch.name, batch.param_dict['status']))
186-
all_batchs.append(batch.__dict__['param_dict'])
187-
next_token = batches.next_token
188-
if next_token is None:
189-
break
190-
print(all_batchs)
230+
for batch in batches:
231+
counter += 1
232+
print(f'Downloading batch {counter} | {batch.name} | {batch.project}')
233+
234+
# Alternative for accessing as a Batch list (use instead of the loop above; a generator can be consumed only once)
235+
batch_list = list(batches)
236+
print(f"{len(batch_list)} batches retrieved")
191237
192238
Projects
193239
________
@@ -221,7 +267,7 @@ __ https://docs.scale.com/reference#project-retrieval
221267
List Projects
222268
^^^^^^^^^^^^^
223269

224-
This function does not take any arguments. Retrieve a list of every Project.
270+
This function does not take any arguments. Retrieve a list of every Project.
225271
Check out `Scale's API documentation`__ for more information.
226272

227273
__ https://docs.scale.com/reference#batch-list
@@ -232,7 +278,7 @@ __ https://docs.scale.com/reference#batch-list
232278
projects = client.projects()
233279
for project in projects:
234280
counter += 1
235-
print('Downloading project %s | %s | %s' % (counter, project['name'], project['type']))
281+
print(f'Downloading project {counter} | {project.name} | {project.type}')
236282
237283
Update Project
238284
^^^^^^^^^^^^^^
@@ -245,23 +291,38 @@ __ https://docs.scale.com/reference#project-update-parameters
245291
246292
data = client.update_project(
247293
project_name='test_project',
248-
pathc = false,
294+
patch = False,
249295
instruction='update: Please label all the stuff',
250296
)
251297
252298
Error handling
253299
______________
254300

255301
If something went wrong while making API calls, then exceptions will be raised automatically
256-
as a `scaleapi.ScaleException` or `scaleapi.ScaleInvalidRequest` runtime error. For example:
302+
as a `ScaleException` or one of its child exception types:
303+
304+
- ``ScaleInvalidRequest``: 400 - Bad Request -- The request was unacceptable, often due to missing a required parameter.
305+
- ``ScaleUnauthorized``: 401 - Unauthorized -- No valid API key provided.
306+
- ``ScaleNotEnabled``: 402 - Not enabled -- Please contact sales@scaleapi.com before creating this type of task.
307+
- ``ScaleResourceNotFound``: 404 - Not Found -- The requested resource doesn't exist.
308+
- ``ScaleDuplicateTask``: 409 - Conflict -- The provided idempotency key or unique_id is already in use for a different request.
309+
- ``ScaleTooManyRequests``: 429 - Too Many Requests -- Too many requests hit the API too quickly.
310+
- ``ScaleInternalError``: 500 - Internal Server Error -- We had a problem with our server. Try again later.
311+
- ``ScaleTimeoutError``: 504 - Server Timeout Error -- Try again later.
312+
313+
Check out `Scale's API documentation <https://docs.scale.com/reference#errors>`_ for more details.
314+
315+
For example:
257316

258317
.. code-block:: python
259318
260-
try
261-
client.create_categorization_task('Some parameters are missing.')
262-
except scaleapi.ValidationError as e:
263-
print(e.code) # 400
264-
print(e.message) # missing param X
319+
from scaleapi.exceptions import ScaleException
320+
321+
try:
322+
client.create_task(TaskType.TextCollection, attachment='Some parameters are missing.')
323+
except ScaleException as err:
324+
print(err.code) # 400
325+
print(err.message) # Parameter is invalid, reason: "attachments" is required
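Because the exceptions listed above are children of ``ScaleException``, a specific case can be caught before the generic parent. The sketch below illustrates that ordering with locally defined stand-in classes that reuse two names from the list; the real classes live in ``scaleapi.exceptions``, and the ``(message, code)`` constructor and the ``describe_failure`` helper shown here are assumptions made for illustration only.

```python
from typing import Callable


class ScaleException(Exception):
    """Stand-in parent type; the (message, code) constructor is an assumption."""

    def __init__(self, message: str, code: int) -> None:
        super().__init__(message)
        self.message = message
        self.code = code


class ScaleInvalidRequest(ScaleException):
    """Stand-in for the 400 - Bad Request child type."""


class ScaleTooManyRequests(ScaleException):
    """Stand-in for the 429 - Too Many Requests child type."""


def describe_failure(request: Callable[[], None]) -> str:
    """Hypothetical helper: run a request and report which handler caught the error."""
    try:
        request()
    except ScaleTooManyRequests:
        # Most specific handler first: rate limiting may deserve its own treatment.
        return "rate limited"
    except ScaleException as err:
        # The parent type catches every other child exception.
        return f"{err.code}: {err.message}"
    return "ok"


def rate_limited_request() -> None:
    raise ScaleTooManyRequests("Too many requests", 429)


def invalid_request() -> None:
    raise ScaleInvalidRequest('"attachments" is required', 400)


print(describe_failure(rate_limited_request))
print(describe_failure(invalid_request))
print(describe_failure(lambda: None))
```

Placing the parent ``except`` clause last matters: Python tries handlers in order, so a leading ``except ScaleException`` would also catch the more specific child types.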
265326
266327
Troubleshooting
267328
_______________

docs/dev_requirements.txt

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
black>=19.10b0
2+
flake8>=3.8.4
3+
pre-commit==2.11.1
4+
isort>=5.7.0
5+
pytest>=6.2.2
6+
pylint>=2.7.2
