[SPR-17039] Support stricter encoding of URI variables in UriComponents Created: 13/Jul/18  Updated: 15/Jan/19  Resolved: 19/Jul/18

Status: Closed
Project: Spring Framework
Component/s: Web
Affects Version/s: 5.0.7
Fix Version/s: 5.1 RC1

Type: Improvement Priority: Minor
Reporter: Rossen Stoyanchev Assignee: Rossen Stoyanchev
Resolution: Complete Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depend
is depended on by SPR-17027 HtmlUnitRequestBuilder decodes plus s... Closed
Relate
is related to SPR-17630 UriComponentsBuilder.toUriString() is... Closed
Supersede
supersedes SPR-16860 Spring is inconsistent in the encodin... Resolved
supersedes SPR-16202 Encoding of URI Variables on RestTemp... Resolved
supersedes SPR-16718 UriComponentsBuilder does not encode ... Resolved
Days since last comment: 9 weeks, 6 days ago
Last commented by a User: true
Last updater: Spring Issuemaster

 Description   

Historically UriComponents has always encoded only characters that are illegal in a given part of the URI (e.g. "/" is illegal in a path segment), and that does not include characters that are legal but have some other reserved meaning (e.g. ";" in a path segment, or also "+" in a query param).

UriComponents has also always relied on expanding URI variables first and then encoding the expanded String. That makes it impossible to apply stricter encoding to URI variable values, which is usually what's expected intuitively, because once expanded the values can no longer be told apart from the rest of the template. Typically the expectation is that expanded values will have been fully encoded.

While the RestTemplate and WebClient can be configured with a UriBuilderFactory that supports different encoding mode strategies, there is currently no real answer when using UriComponents directly.
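
To make the ordering problem concrete, here is a minimal JDK-only sketch. It is not Spring's implementation: the class and method names are hypothetical, java.net.URI's multi-argument constructor stands in for "quote only illegal characters", and encodeStrict is an assumed helper that keeps only RFC 3986 unreserved characters.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;

public class EncodeOrderSketch {

    // Hypothetical helper: percent-encode every byte that is not an
    // RFC 3986 "unreserved" character (ALPHA / DIGIT / "-" / "." / "_" / "~").
    static String encodeStrict(String value) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean unreserved = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~';
            if (unreserved) {
                out.append((char) c);
            } else {
                out.append('%').append(String.format("%02X", c));
            }
        }
        return out.toString();
    }

    // Expand first, then quote only illegal characters. After expansion the
    // value is indistinguishable from the template, so "+" and "@" (both legal
    // in a query) are left alone.
    static String expandThenEncode(String email) {
        String query = "email={email}".replace("{email}", email);
        try {
            return new URI("https", "example.com", "/config", query, null).toASCIIString();
        } catch (URISyntaxException ex) {
            throw new IllegalArgumentException(ex);
        }
    }

    // Encode the value strictly while it is still known to be a value,
    // then substitute it into the template.
    static String encodeThenExpand(String email) {
        return "https://example.com/config?email={email}"
                .replace("{email}", encodeStrict(email));
    }

    public static void main(String[] args) {
        // https://example.com/config?email=service+bar@gmail.com
        System.out.println(expandThenEncode("service+bar@gmail.com"));
        // https://example.com/config?email=service%2Bbar%40gmail.com
        System.out.println(encodeThenExpand("service+bar@gmail.com"));
    }
}
```

Only the second order can treat the value more strictly than the surrounding template, which is the behaviour this issue asks for.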



 Comments   
Comment by Rossen Stoyanchev [ 17/Jul/18 ]

This is now ready, see the updated "URI Encoding" in the docs. The short version is, use the new UriComponentsBuilder#encode method and not the existing one in UriComponents, i.e. invoke encode before and not after expanding URI variables.

Please give this a try with 5.0.8 or 5.1 snapshots to confirm how it works in your application.

Comment by Christophe Levesque [ 17/Jul/18 ]

Thanks Rossen Stoyanchev! The only downside is that it requires that extra UriComponentsBuilder#encode call.

UriComponentsBuilder.fromHttpUrl(url).queryParam("foo", foo).toUriString(); // <= this would still not work, needs to add new encode() after toUriString

Is there a way to change the toUriString method in a way that would have the previous code work as is?

PS: Also, unrelated request: can there be a toUri() shorthand method in UriComponentsBuilder the same way there is a toUriString()?

Comment by Rossen Stoyanchev [ 17/Jul/18 ]

The key to understanding this is that different degrees of encoding are applied to the URI template vs URI variables. In other words, given:

http://example.com/a/{b}/c?q={q}&p={p}

The URI template is everything except for the URI variable placeholders. However, the code snippet you showed only builds a URI literal without any variables, so the level of encoding is the same (only illegal characters) no matter which method is used.

So it would also have to be something like:

.queryParam("foo", "{foo}").buildAndExpand(foo)

By the time .toUriString() is called, the expand would have happened and at that point it's too late to encode URI variables more strictly. Unfortunately we cannot switch the default mode of encoding in UriComponentsBuilder at this stage since that's the only way toUriString() could work without a call to encode().
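
The template-versus-variable distinction can be sketched with the JDK alone. This is an illustration only, not how UriComponentsBuilder is implemented: the class, the placeholder pattern, and encodeValue (form encoding with "+" rewritten to "%20", used here as an approximation of strict encoding) are all assumptions, and Java 10 or later is assumed for the Charset overload of URLEncoder.encode.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TemplateVsVariableSketch {

    // Matches {name} placeholders in a URI template.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([^/{}]+)}");

    // Approximation of strict value encoding: form-encode, then turn the
    // "+" that URLEncoder emits for a space into "%20".
    static String encodeValue(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Replace each placeholder with its strictly encoded value.
    // Literal template text is copied through unchanged.
    static String expand(String template, Map<String, String> vars) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = vars.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No value for " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(encodeValue(value)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // "a+b" appears once as a literal (p) and once as a variable value (q).
        String uri = expand("http://example.com/a/{b}/c?q={q}&p=a+b",
                Map.of("b", "x;y", "q", "a+b"));
        // http://example.com/a/x%3By/c?q=a%2Bb&p=a+b
        System.out.println(uri);
    }
}
```

The literal "p=a+b" passes through untouched while the same text supplied as a variable is encoded, which mirrors why a value has to be passed as a URI variable to get the stricter treatment.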

That said UriComponentsBuilder does have a buildAndExpand shortcut to URI:

URI uri = UriComponentsBuilder.fromHttpUrl(url).queryParam("foo", "{foo}").encode().buildAndExpand("a+b").toUri();

Comment by Rossen Stoyanchev [ 18/Jul/18 ]

Come to think of it, this works and it's almost the same length as what you had:

URI uri = UriComponentsBuilder
        .fromHttpUrl(url).queryParam("foo", "{foo}").build(foo);

Or include it in the URI template:

URI uri = UriComponentsBuilder
        .fromHttpUrl(url + "?foo={foo}").build(foo);

Explanation: the build(Object...) and build(Map<String, Object>) methods had to be implemented as part of UriBuilder (new in 5.0) but that contract is more likely to be used through DefaultUriBuilderFactory, if at all, and is overall quite new. So I've switched those methods internally to do encode().buildAndExpand().toUri().

For toUriString() a switch from build() + encode() to encode() + build() would make no difference, because no URI variables are expanded. We could add a toUri() as well but that would also have no effect on encoding, i.e. same as toUriString().

Comment by Rossen Stoyanchev [ 19/Jul/18 ]

Resolving but now is a good time to try this.

Comment by Michal Domagala [ 08/Nov/18 ]

I use an autoconfigured WebTestClient in my integration tests. I was satisfied with EncodingMode.URI_COMPONENT because I could easily test the trimming of spaces in a request argument: I can just add a space to my request, .queryParam("foo", " bar"), and verify that the space is trimmed on the server side.

Could you point me to an elegant way to undo SPR-17039 for an autoconfigured WebTestClient? The only way I found is to drop the autoconfigured one and create a custom client.

Comment by Rossen Stoyanchev [ 09/Nov/18 ]

Michal Domagala, I've created a ticket in Boot. I've also included at least one example there of how it can be done. There may be better ways though, so watch for updates on the ticket.

Comment by Leonard Brünings [ 20/Nov/18 ]

Rossen Stoyanchev, please advise on how to solve this with a generic UriTemplate:

    // templates are populated via Spring Boot Configuration Properties
    private Map<String, UriTemplate> templates = new HashMap<>();

    public URI getLink(String linkType, Map<String, String> templateParams) {
        return templates.get(linkType).expand(templateParams);
    }
conf:
  links:
    configSite: "https://example.com/config?email={email}"

At this point we don't know which variables we have, or whether they are query or path variables.

URI redirectUri = getLink("configSite", Collections.singletonMap("email", "service+bar@gmail.com"));
// renders    https://example.com/config?email=service+bar@gmail.com
// instead of https://example.com/config?email=service%2Bbar%40gmail.com

This produces a valid URI, but it doesn't encode the + and @, so the plus gets decoded to a space on the receiving side. The problem is that we can't even apply URLEncoder.encode manually on the calling side, as this would cause double encoding.
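
Both failure modes can be reproduced with the JDK alone. The sketch below is only an analogy for what happens with UriTemplate: the class and method names are hypothetical, java.net.URI's multi-argument constructor stands in for "expand, then encode only illegal characters", URLDecoder stands in for a receiver that form-decodes query values, and Java 10 or later is assumed for the Charset overloads.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class DoubleEncodingSketch {

    // Stand-in for lenient encoding: quotes illegal characters only,
    // and always quotes a literal "%".
    static String lenientUri(String query) {
        try {
            return new URI("https", "example.com", "/config", query, null).toASCIIString();
        } catch (URISyntaxException ex) {
            throw new IllegalArgumentException(ex);
        }
    }

    // What a form-decoding receiver sees for a raw query value.
    static String formDecode(String rawValue) {
        return URLDecoder.decode(rawValue, StandardCharsets.UTF_8);
    }

    // Pre-encode the value by hand, then let the lenient encoder run over it.
    static String preEncodedThenLenient(String value) {
        String preEncoded = URLEncoder.encode(value, StandardCharsets.UTF_8);
        return lenientUri("email=" + preEncoded);
    }

    public static void main(String[] args) {
        // Problem 1: an unencoded "+" arrives as a space.
        // service bar@gmail.com
        System.out.println(formDecode("service+bar@gmail.com"));

        // Problem 2: the "%" of the hand-encoded value is encoded again.
        // https://example.com/config?email=service%252Bbar%2540gmail.com
        System.out.println(preEncodedThenLenient("service+bar@gmail.com"));
    }
}
```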

Comment by Leonard Brünings [ 21/Nov/18 ]

I managed to get the desired result with:

    public URI getLink(String linkType, Map<String, String> templateParams) {
        return UriComponentsBuilder.fromUriString(templates.get(linkType).toString())
                .encode()
                .buildAndExpand(templateParams).toUri();
    }

However, I must say this is quite ugly. Is there any way we could get UriTemplate.encode().expand()?

Comment by Rossen Stoyanchev [ 21/Nov/18 ]

Not really, UriTemplate uses UriComponentsBuilder internally, and provides just one way of doing it. You could shorten your example to:

    public URI getLink(String linkType, Map<String, String> templateParams) {
        return UriComponentsBuilder.fromUriString(templates.get(linkType).toString())
                .build(templateParams);
    }

Comment by Spring Issuemaster [ 14/Jan/19 ]

The Spring Framework has migrated to GitHub Issues. This issue corresponds to spring-projects/spring-framework#21577.
