Commit 0907472

Merge pull request #1375 from scala/update-github-links
2 parents fc05a0f + dc2e90d commit 0907472

1 file changed, +17 -17 lines changed

blog/_posts/2017-08-28-gsoc-connecting-contributors-with-projects.md

Lines changed: 17 additions & 17 deletions
@@ -17,7 +17,7 @@ For Google Summer of Code 2017, my [project](https://summerofcode.withgoogle.com
 [![front-page-contributing](/resources/img/blog/scaladex/front-page-contributing.png)](/resources/img/blog/scaladex/front-page-contributing.png)
 *Highlighted projects with Contributing Info on the front page of Scaladex*

-Furthermore, I improved the search feature of Scaladex by adding [Github Topics](https://github.com/blog/2309-introducing-topics) to the projects stored in Scaladex so that users can search projects based on Topics. Topics are essentially categories that open-source projects belong to like android, databases, json, ...
+Furthermore, I improved the search feature of Scaladex by adding [GitHub Topics](https://github.com/blog/2309-introducing-topics) to the projects stored in Scaladex so that users can search projects based on Topics. Topics are essentially categories that open-source projects belong to like android, databases, json, ...

 [![front-page-topics](/resources/img/blog/scaladex/front-page-topics.png)](/resources/img/blog/scaladex/front-page-topics.png)
 *Topics for projects on the front page of Scaladex*
@@ -39,9 +39,9 @@ Here's some more info about how each piece of contributing info gets set:
 - chatroom - auto-populated to a project's gitter room if it has one
 - contributing guide - auto-populated to a project's CONTRIBUTING.md if it has one

-As an example, the [Scaladex project](https://github.com/scalacenter/scaladex) (for the code behind the website) uses the label "low-hanging fruit" to mark beginner-friendly issues in Github so this label can be set by the maintainer in the edit project page and all the [issues with this label](https://github.com/scalacenter/scaladex/labels/low-hanging%20fruit) will be stored for this project. It also has a [gitter room](https://gitter.im/scalacenter/scaladex) for chatting and a [contributing guide](https://github.com/scalacenter/scaladex/blob/master/CONTRIBUTING.md) which will be auto-populated for the project when all the projects are indexed.
+As an example, the [Scaladex project](https://github.com/scalacenter/scaladex) (for the code behind the website) uses the label "low-hanging fruit" to mark beginner-friendly issues in GitHub so this label can be set by the maintainer in the edit project page and all the [issues with this label](https://github.com/scalacenter/scaladex/labels/low-hanging%20fruit) will be stored for this project. It also has a [gitter room](https://gitter.im/scalacenter/scaladex) for chatting and a [contributing guide](https://github.com/scalacenter/scaladex/blob/master/CONTRIBUTING.md) which will be auto-populated for the project when all the projects are indexed.

-Scaladex uses Github's GraphQL API to get a project's beginner-friendly issues, see the [Github Topics](#github-topics) section below for more info about Github's GraphQL API. To get a project's contributing guide, Scaladex uses Github's REST API to send a GET request to the Community Profile API which will return links to a project's contributing guide, code of conduct and license. Lastly, to get a project's chatroom, Scaladex generates a URL for a project's gitter room based on the project's repository name and the organization it belongs to (Ex. <https://gitter.im/scalacenter/scaladex>) and checks if that URL exists.
+Scaladex uses GitHub's GraphQL API to get a project's beginner-friendly issues, see the [GitHub Topics](#github-topics) section below for more info about GitHub's GraphQL API. To get a project's contributing guide, Scaladex uses GitHub's REST API to send a GET request to the Community Profile API which will return links to a project's contributing guide, code of conduct and license. Lastly, to get a project's chatroom, Scaladex generates a URL for a project's gitter room based on the project's repository name and the organization it belongs to (Ex. <https://gitter.im/scalacenter/scaladex>) and checks if that URL exists.

 You can also find Contributing Info on the front page of Scaladex. Now, Scaladex highlights a random subset of projects which have Contributing Info on the front page of Scaladex. It picks a random selection of projects each time the page is loaded to give the same amount of exposure to all projects with Contributing Info. We hope to highlight and better guide potential contributors to projects and issues that are of interest to them!

@@ -54,9 +54,9 @@ The Contributing Search page is similar to the normal search page in Scaladex wh
 The code for Contributing Info was committed in 2 pull requests, 1 for the [back-end](https://github.com/scalacenter/scaladex/pull/448) and 1 for the [front-end](https://github.com/scalacenter/scaladex/pull/467).

 ### Challenge
-One interesting challenge I ran into was filtering a project's issues based on a search term. For example, say a user is searching for all issues related to documentation so they enter "docs" as a search term in the Contributing Search page. A project called akka-http has some beginner-friendly issues, one of which is related to documentation with the title "#22874 - Add examples to Sink.actorRefWithAck and Source.queue docs". Since this is the only issue for akka-http that has "docs" in it's title, it should be the only issue that shows up for akka-http in the search results.
+One interesting challenge I ran into was filtering a project's issues based on a search term. For example, say a user is searching for all issues related to documentation so they enter "docs" as a search term in the Contributing Search page. A project called akka-http has some beginner-friendly issues, one of which is related to documentation with the title "#22874 - Add examples to Sink.actorRefWithAck and Source.queue docs". Since this is the only issue for akka-http that has "docs" in its title, it should be the only issue that shows up for akka-http in the search results.

-All the projects in Scaladex are stored in an [elasticsearch index](https://www.elastic.co/blog/what-is-an-elasticsearch-index) which is like a database in a relational database. Each project stored in elasticsearch has the following fields:
+All the projects in Scaladex are stored in an [Elasticsearch index](https://www.elastic.co/blog/what-is-an-elasticsearch-index) which is like a database in a relational database. Each project stored in Elasticsearch has the following fields:
 ```
 name: Text
 description: Text
@@ -69,24 +69,24 @@ github: Object
 title: Text
 ...
 ```
-Each project has a `github` field of type `Object` containing Github info like a project's readme and it's number of commits. The `github` field has a `beginnerIssues` field which is a list of a project's beginner-friendly issues. The `beginnerIssues` field is of type Nested, which is a special version of the `Object` type used for lists of `Object`s. Each issue in `beginnerIssues` is of type `Object` and it has a `number` field and a `title` field.
+Each project has a `github` field of type `Object` containing GitHub info like a project's readme and its number of commits. The `github` field has a `beginnerIssues` field which is a list of a project's beginner-friendly issues. The `beginnerIssues` field is of type Nested, which is a special version of the `Object` type used for lists of `Object`s. Each issue in `beginnerIssues` is of type `Object` and it has a `number` field and a `title` field.

-When Scaladex generates a search query to match the input search term ("docs" from the example above) to an elasticsearch query, all you have to do to match the search term against a project's beginner-friendly issues is add a Nested Query against the `github.beginnerIssues` field and specify you want to match the search term against the issue's `title` field. So this is the Nested Query I added to [DataRepository.scala](https://github.com/scalacenter/scaladex/pull/467/commits/5bcecb58e91c52590e4460189d0415db4d4d2e1f#diff-c5de88d14364dfaadbdecdc462d6c7d1R254) which generates the elasticsearch query:
+When Scaladex generates a search query to match the input search term ("docs" from the example above) to an Elasticsearch query, all you have to do to match the search term against a project's beginner-friendly issues is add a Nested Query against the `github.beginnerIssues` field and specify you want to match the search term against the issue's `title` field. So this is the Nested Query I added to [DataRepository.scala](https://github.com/scalacenter/scaladex/pull/467/commits/5bcecb58e91c52590e4460189d0415db4d4d2e1f#diff-c5de88d14364dfaadbdecdc462d6c7d1R254) which generates the Elasticsearch query:
 ```
 nestedQuery("github.beginnerIssues",
 termQuery("github.beginnerIssues.title", searchTerm))
 ```

 This sort of worked. It would return the correct projects that have issues matching the search term, but instead of returning only the issues related to the search term, it would return all the issues. So in the example with the "docs" search term, all of akka-http's issues would be returned, not just the one related to documentation.

-After looking through the elasticsearch documentation for awhile, I came across Inner Hits which can be used with Nested Queries to select out the nested inner objects that matched the query. So inner hits would return only the beginner-friendly issues that matched the search term. So I updated the code that creates the Nested Query to also extract the inner hits that get returned:
+After looking through the Elasticsearch documentation for a while, I came across Inner Hits which can be used with Nested Queries to select out the nested inner objects that matched the query. So inner hits would return only the beginner-friendly issues that matched the search term. So I updated the code that creates the Nested Query to also extract the inner hits that get returned:
 ```
 nestedQuery("github.beginnerIssues",
 termQuery("github.beginnerIssues.title", searchTerm))
 .inner(innerHits("issues").size(7))
 ```

-And then I added the filtered beginner-friendly issues from inner hits to the project that gets created from the results of the elasticsearch query. I did this by updating the code in [package.scala](https://github.com/scalacenter/scaladex/pull/467/commits/5bcecb58e91c52590e4460189d0415db4d4d2e1f#diff-0aa128fca8ddf4b576663970f7fc4940R39) that reads in each result of the elasticsearch query (`hit`) and converts it to a Scala `Project` object which is used by the server elsewhere.
+And then I added the filtered beginner-friendly issues from inner hits to the project that gets created from the results of the Elasticsearch query. I did this by updating the code in [package.scala](https://github.com/scalacenter/scaladex/pull/467/commits/5bcecb58e91c52590e4460189d0415db4d4d2e1f#diff-0aa128fca8ddf4b576663970f7fc4940R39) that reads in each result of the Elasticsearch query (`hit`) and converts it to a Scala `Project` object which is used by the server elsewhere.
 ```
 implicit object ProjectAs extends HitReader[Project] {
 override def read(hit: Hit): Either[Throwable, Project] = {
@@ -117,19 +117,19 @@ implicit object ProjectAs extends HitReader[Project] {
 }
 ```

-## Github Topics
+## GitHub Topics
 ### How it Works
 To categorize projects in Scaladex, the old process was for project maintainers to manually set keywords for their project in Scaladex. Users could then search for projects based on keywords.

-Github recently added ["topics"](https://github.com/blog/2309-introducing-topics) to projects stored in Github which are labels that can be set for a project corresponding to categories that a project belongs to. Topics are essentially the same as keywords in Scaladex but maintainers could set them for their project in Github instead of having to do so in Scaladex.
+GitHub recently added ["topics"](https://github.com/blog/2309-introducing-topics) to projects stored in GitHub which are labels that can be set for a project corresponding to categories that a project belongs to. Topics are essentially the same as keywords in Scaladex but maintainers could set them for their project in GitHub instead of having to do so in Scaladex.

-Topics are part of Github’s new [GraphQL API](https://developer.github.com/v4/) which is meant to eventually replace their old [REST API](https://developer.github.com/v3/). [GraphQL](https://graphql.org/) is a "A query language for your API". It is both a query language and a graph-structured schema which stores data with nodes as objects and edges as relationships between objects. It was developed by Facebook and is different from a traditional REST API by having all API requests go to one route and having a query defined in the request body to specify precisely what data you want.
+Topics are part of GitHub’s new [GraphQL API](https://docs.github.com/en/graphql) which is meant to eventually replace their old [REST API](https://docs.github.com/en/rest). [GraphQL](https://graphql.org/) is a "A query language for your API". It is both a query language and a graph-structured schema which stores data with nodes as objects and edges as relationships between objects. It was developed by Facebook and is different from a traditional REST API by having all API requests go to one route and having a query defined in the request body to specify precisely what data you want.

-With Github's REST API, you have to make multiple requests to different routes to get project info about multiple projects. And when you make a request, all the data related to that request would be returned. For example, if you wanted to get the most recent 3 issues created for 5 different projects, you would make 5 requests to 5 different routes for each project. Each request would return all the project’s issues. With the GraphQL API, all requests are made to the same route and in the body of the request you input a GraphQL query which specifies exactly what information you want and for which projects. So for the example above of getting the most recent 3 issues created for 5 projects, you would make 1 request to 1 route containing a query to get only the 3 most recent issues for the 5 projects and only those 3 issues for each of the projects would be returned. This results in less requests to Github’s API and less data returned in each response.
+With GitHub's REST API, you have to make multiple requests to different routes to get project info about multiple projects. And when you make a request, all the data related to that request would be returned. For example, if you wanted to get the most recent 3 issues created for 5 different projects, you would make 5 requests to 5 different routes for each project. Each request would return all the project’s issues. With the GraphQL API, all requests are made to the same route and in the body of the request you input a GraphQL query which specifies exactly what information you want and for which projects. So for the example above of getting the most recent 3 issues created for 5 projects, you would make 1 request to 1 route containing a query to get only the 3 most recent issues for the 5 projects and only those 3 issues for each of the projects would be returned. This results in less requests to GitHub’s API and less data returned in each response.

-So I replaced keywords with topics for projects in Scaladex and used Github’s new GraphQL API to fetch the topics. These topics are fetched for all projects when the server is indexed. A lot more projects have topics than keywords (which had to manually be set by maintainers in Scaladex), so this greatly improved the ability to search for projects based on categories in Scaladex since there are a lot more projects with categories.
+So I replaced keywords with topics for projects in Scaladex and used GitHub’s new GraphQL API to fetch the topics. These topics are fetched for all projects when the server is indexed. A lot more projects have topics than keywords (which had to manually be set by maintainers in Scaladex), so this greatly improved the ability to search for projects based on categories in Scaladex since there are a lot more projects with categories.

-Here's the code I added to [GithubDownload.scala](https://github.com/scalacenter/scaladex/commit/a771d7a70fdb7aaa0003abf48aaa87a622d89f03#diff-e03c541cf1bd7ec0322a9a6571160bebR339) which contains the GraphQL query that is put in the POST body of the request sent to Github's GraphQL API to fetch topics for a project. You can see the graph-structure of GraphQL in the query. The query first gets a `repository` node and then accesses it's topics through the `repositoryTopics` edge/connection. Then it selects the names of the topics belonging to that repository.
+Here's the code I added to [GithubDownload.scala](https://github.com/scalacenter/scaladex/commit/a771d7a70fdb7aaa0003abf48aaa87a622d89f03#diff-e03c541cf1bd7ec0322a9a6571160bebR339) which contains the GraphQL query that is put in the POST body of the request sent to GitHub's GraphQL API to fetch topics for a project. You can see the graph-structure of GraphQL in the query. The query first gets a `repository` node and then accesses its topics through the `repositoryTopics` edge/connection. Then it selects the names of the topics belonging to that repository.
 ```
 private def topicQuery(repo: GithubRepo): JsObject = {
@@ -153,7 +153,7 @@ private def topicQuery(repo: GithubRepo): JsObject = {
 )
 }
 ```
-If you run that query for the [akka](https://github.com/akka/akka) project, this is what gets returned from the Github API:
+If you run that query for the [akka](https://github.com/akka/akka) project, this is what gets returned from the GitHub API:
 ```
 {
 "data": {
@@ -203,7 +203,7 @@ If you run that query for the [akka](https://github.com/akka/akka) project, this
 ```

 ### Code
-The code for Github Topics was committed in [one pull request](https://github.com/scalacenter/scaladex/pull/421).
+The code for GitHub Topics was committed in [one pull request](https://github.com/scalacenter/scaladex/pull/421).

 ## Closing Remarks
 Huge thanks to my mentor Heather Miller who was very approachable and always took the time to discuss the best way to implement this project.