Standardize text normalization #485

Closed
colinodell opened this issue May 24, 2020 · 1 comment · Fixed by #486
Labels: enhancement (New functionality or behavior), implemented (Change has been implemented)
Milestone: v1.5

Comments

@colinodell
Member

(Pulling #484 (comment) into a separate thread)

There are a few places within this library where text is manipulated into a "normalized" version:

In some cases we want these to be "pluggable" so that users can customize the output to fit their needs; in others we want to use a standardized approach to comply with the spec. But in both cases I think we can use a common interface for normalizing text.

I'd therefore like to unify these under a single interface:

<?php

namespace League\CommonMark\Normalizer;

/**
 * Creates a normalized version of the given input text
 */
interface TextNormalizerInterface
{
    /**
     * @param string $text    The text to normalize
     * @param mixed  $context Additional context about the text being normalized (optional)
     */
    public function normalize(string $text, $context = null): string;
}

$context can be used to provide additional information about the text being normalized. For example, when normalizing heading content within the HeadingPermalinkProcessor, we can pass along the Heading node as the $context in case somebody wants to check the heading level when generating a slug. In other cases this may be null.
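
To illustrate how $context might be used, here is a minimal sketch of a normalizer that checks the heading level. It redeclares the proposed interface locally so it runs on its own. `HeadingStub` and `LevelAwareSlugNormalizer` are hypothetical names invented for this example, not library classes, and the ASCII-only slug logic is an assumption:

```php
<?php

// Local copy of the proposed interface so this sketch is self-contained.
interface TextNormalizerInterface
{
    public function normalize(string $text, $context = null): string;
}

// Hypothetical stand-in for the library's Heading node; only the level matters here.
final class HeadingStub
{
    private $level;

    public function __construct(int $level)
    {
        $this->level = $level;
    }

    public function getLevel(): int
    {
        return $this->level;
    }
}

// Hypothetical normalizer that prefixes the slug with the heading level when one is supplied.
final class LevelAwareSlugNormalizer implements TextNormalizerInterface
{
    public function normalize(string $text, $context = null): string
    {
        // Lowercase, then turn each run of non-alphanumeric characters into one hyphen (ASCII only).
        $slug = trim((string) preg_replace('/[^a-z0-9]+/', '-', strtolower($text)), '-');

        if ($context instanceof HeadingStub) {
            return 'h' . $context->getLevel() . '-' . $slug;
        }

        // No usable context: fall back to the plain slug.
        return $slug;
    }
}

$normalizer = new LevelAwareSlugNormalizer();
echo $normalizer->normalize('Getting Started', new HeadingStub(2)), "\n"; // h2-getting-started
echo $normalizer->normalize('Getting Started'), "\n";                     // getting-started
```

Because $context is untyped and optional, an implementation like this has to check what it received and still return something sensible when it gets null.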

We'll provide two implementations out-of-the-box:

  1. A TextNormalizer based on Reference::normalizeReference which performs the normalization prescribed by the spec
  2. A SlugNormalizer which works similarly, but converts text to lowercase and uses - separators instead of spaces
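
As a rough sketch of how these two implementations could look (not the library's actual code), the example below redeclares the interface locally and uses hypothetical class names `SketchTextNormalizer` and `SketchSlugNormalizer`. It assumes the spec's reference normalization amounts to trimming, collapsing internal whitespace, and case folding, and it approximates case folding with plain lowercasing:

```php
<?php

// Local copy of the proposed interface so this sketch is self-contained.
interface TextNormalizerInterface
{
    public function normalize(string $text, $context = null): string;
}

// Hypothetical approximation of spec-style reference normalization.
final class SketchTextNormalizer implements TextNormalizerInterface
{
    public function normalize(string $text, $context = null): string
    {
        // Trim the ends and collapse each run of internal whitespace to one space.
        $collapsed = (string) preg_replace('/\s+/', ' ', trim($text));

        // Assumption: lowercasing stands in for full Unicode case folding.
        return function_exists('mb_strtolower')
            ? mb_strtolower($collapsed, 'UTF-8')
            : strtolower($collapsed);
    }
}

// Hypothetical slug normalizer that builds on the text normalizer.
final class SketchSlugNormalizer implements TextNormalizerInterface
{
    private $inner;

    public function __construct(TextNormalizerInterface $inner)
    {
        $this->inner = $inner;
    }

    public function normalize(string $text, $context = null): string
    {
        $normalized = $this->inner->normalize($text, $context);

        // Drop everything except letters, digits, spaces, and hyphens.
        $stripped = (string) preg_replace('/[^\p{L}\p{N} -]+/u', '', $normalized);

        // Replace each run of spaces and hyphens with a single hyphen.
        return trim((string) preg_replace('/[ -]+/', '-', $stripped), '-');
    }
}

$text = new SketchTextNormalizer();
$slug = new SketchSlugNormalizer($text);

echo $text->normalize("  Foo \t\n BAR  "), "\n"; // foo bar
echo $slug->normalize('Hello, World!'), "\n";    // hello-world
```

Composing the slug normalizer on top of the text normalizer is only one way to express "works similarly"; two independent classes would satisfy the same interface.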

Existing code will be updated to use these and we'll set relevant deprecation notices where needed.

Furthermore, we'll rename the new heading_permalink/slug_generator option to heading_permalink/slug_normalizer.

The FootnoteExtension will also be modified to use this new normalizer approach, though further research will be needed to determine if we should leverage the spec's reference normalization approach and/or allow customization.

@colinodell added the enhancement (New functionality or behavior) label May 24, 2020
@colinodell added this to the v1.5 milestone May 24, 2020
@colinodell
Member Author

The footnote extension classes rely on the Reference implementation, which normalizes strings a certain way, so let's avoid allowing customization for now. We can always revisit this later.

@close-label (bot) added the implemented (Change has been implemented) label May 25, 2020