SparkSQL write operations fail when using raw JSON #1303

Open
@jbaiera

Description

When using SparkSQL, writing a DataFrame to Elasticsearch fails if the es.input.json property is set to true.
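
For example, a write like the following triggers the failure (a rough sketch; the index name "test/doc" and the two-column schema are illustrative):

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

// With es.input.json=true the connector expects every record to already be
// a serialized JSON document, but SparkSQL hands it structured Row objects.
val spark = SparkSession.builder().appName("es-json-repro").getOrCreate()
import spark.implicits._

val df = Seq(("id3", "email1@gmail.com")).toDF("id", "email")

df.write
  .format("org.elasticsearch.spark.sql")
  .option("es.input.json", "true") // incompatible with a structured DataFrame
  .mode(SaveMode.Append)
  .save("test/doc")                // fails with EsHadoopSerializationException
```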

When writing a DataFrame that is not compatible with this write mode, the error ends up as something like:

org.elasticsearch.hadoop.serialization.EsHadoopSerializationException: org.codehaus.jackson.JsonParseException: Unexpected character ('(' (code 40)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
 at [Source: [B@25a91b8c; line: 1, column: 2]

or, if using a Dataset[String] (which one would reasonably expect to work), the exception looks like:

org.elasticsearch.hadoop.rest.EsHadoopRemoteException: mapper_parsing_exception: failed to parse;org.elasticsearch.hadoop.rest.EsHadoopRemoteException: not_x_content_exception: Compressor detection can only be called on some xcontent bytes or compressed xcontent bytes
	{"index":{}}
([{"id": "id3", "email": "email1@gmail.com"}],StructType(StructField(value,StringType,true)))

The current workaround is to use a Spark RDD[String] instead (see the sketch below), but ideally ES-Hadoop would support Dataset[String] for writing JSON data, check for other incompatible inputs, and throw appropriate errors.
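
A sketch of the RDD-based workaround, using the saveJsonToEs implicit from org.elasticsearch.spark (index name illustrative):

```scala
import org.apache.spark.sql.SparkSession
import org.elasticsearch.spark._ // adds saveJsonToEs to RDDs of JSON strings

val spark = SparkSession.builder().appName("es-json-rdd").getOrCreate()
val json  = """{"id": "id3", "email": "email1@gmail.com"}"""

// Writing through a plain RDD[String] works as expected today.
spark.sparkContext.makeRDD(Seq(json)).saveJsonToEs("test/doc")
```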

relates #1280
