Class Text
- Author:
- knoxg
-
Nested Class Summary
Nested Classes -
Field Summary
Fields
Modifier and Type | Field | Description
static final int | JUSTIFICATION_CENTER | Center-justification constant for use in the pad(String, int, int) method
static final int | JUSTIFICATION_LEFT | Left-justification constant for use in the pad(String, int, int) method
static final int | JUSTIFICATION_RIGHT | Right-justification constant for use in the pad(String, int, int) method
static Pattern | scriptPattern |
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
static int | compareNatural(Collator collator, String s, String t) | Compares two strings using the current locale's rules and comparing contained numbers based on their numeric values.
static String | createEscapedPath(String[] pathComponents) | Escapes the components of a path String, returning an escaped full path String.
static char[] | encodeBase64(byte[] in) | Encodes a byte array into Base64 format.
static String | encodeBase64(String s) | Encodes a string into Base64 format.
static String | escapeCss | Returns the CSS-escaped form of a string.
static String | escapeCsv(String string) | Returns the csv-escaped form of a string.
static String | escapeHtml(String string) | Returns the HTML-escaped form of a string.
static String | escapeJava(String string) | Returns a java-escaped string.
static String | escapeJavascript(String string) | Returns a javascript string.
static String | escapeJavascript2(String string) | Deprecated.
static String | escapePathComponent(String string) | Escape a filename or path component.
static String | escapePython(String string) | Returns a python string, escaped so that it can be enclosed in a single-quoted string.
static String | escapeQueryString(String unescapedQueryString) | Escape this supplied string so it can represent a 'name' or 'value' component on a HTTP queryString.
static String | escapeRegex(String string) | Returns a regex-escaped form of a string.
static String | getCommonPrefix(String string1, String string2) | Returns the largest common prefix between two other strings.
static String | getDisplayString(String key, String string) | Returns the given string; but will truncate it to MAX_STRING_OUTPUT_CHARS.
static String | getDisplayString(String key, String string, int maxChars) | Returns the given string; but will truncate it to MAX_STRING_OUTPUT_CHARS.
static String | getFileContents(File file) | Reads a file, and returns its contents in a String.
static String | getFileContents(String filename) | Reads a file, and returns its contents in a String.
static String | getLastComponent(String string) | Given a period-separated list of components, returns the last component.
static int | getLevenshteinDistance | Number of character edits between two strings; taken from http://www.merriampark.com/ldjava.htm.
static String | getMD5 | Return the md5 hash of a string.
static Comparator<String> | getNaturalComparator | Returns a comparator that compares contained numbers based on their numeric values and compares other parts using the current locale's order rules.
static String | indent | Prefixes every line supplied with a given indent.
static boolean | isBlank | Returns true if the supplied string is null or the empty string, false otherwise.
static boolean | isNumeric | Returns true if the supplied string is non-null and only contains numeric characters.
static boolean | isNumericDecimal(String text) | Returns true if the supplied string is non-null and only contains numeric characters or a single decimal point.
static boolean | isNumericDecimalExp(String text) | Returns true if the supplied string is non-null and only contains numeric characters or a single decimal point.
static String | join | Return a string composed of a series of strings, separated with the specified delimiter.
static String | join | Return a string composed of a series of strings, separated with the specified delimiter.
static String | joinWithLast(Iterable<?> elements, boolean isQuoted, String delimiter, String lastDelimiter) | Return a string composed of a series of strings, separated with the specified delimiter.
static String | joinWithLast(String[] elements, boolean isQuoted, String delimiter, String lastDelimiter) | Return a string composed of a series of strings, separated with the specified delimiter.
static String | pad(String inputString, int length, int justification) | Ensure that a string is padded with spaces so that it meets the required length.
static Text.CsvLineReader | parseCsv |
 | parseCsv(String text) | Equivalent to parseCsv(text, false) (i.e. whitespace-insensitive parsing).
 | parseCsv(String text, boolean whitespaceSensitive) | Given a csv-encoded string (as produced by the rules in escapeCsv(String)), produces a List of Strings which represent the individual values in the string.
static String | reduceNewlines(String input) | Ensures that a string returned from a browser (on any platform) conforms to unix line-EOF conventions.
static String | repeat | Returns a string composed of the supplied text, repeated 0 or more times.
static String | replaceString(String originalString, String searchString, String replaceString) | An efficient search & replace routine.
static String[] | splitEscapedPath(String escapedPath) | Split a path, but allow forward slashes in path components if they're escaped by a preceding '\' character.
static String | strDefault(String strText, String strDefaultText) | Utility function to return a default if the supplied string is null.
static String | substitutePlaceholders(Map<?, ?> variables, String text) | Perform ${xxxx}-style substitution of placeholders in text.
static String | toFirstLower(String text) | Lowercases the first character of a string.
static String | toFirstUpper(String text) | Uppercases the first character of a string.
static String | unescapeJava(String string) | Unescapes a java-escaped string.
static String | unescapePathComponent(String pathComponent) | Unescape a filename or path component.
static String | unescapeQueryString | Unescape a HTTP escaped string.
-
Field Details
-
JUSTIFICATION_LEFT
Left-justification constant for use in the pad(String, int, int) method.
-
JUSTIFICATION_CENTER
Center-justification constant for use in the pad(String, int, int) method.
-
JUSTIFICATION_RIGHT
Right-justification constant for use in the pad(String, int, int) method.
-
scriptPattern
-
-
Constructor Details
-
Text
public Text()
-
-
Method Details
-
isBlank
Returns true if the supplied string is null or the empty string, false otherwise.
- Parameters:
text
- The string to test
- Returns:
- true if the supplied string is null or the empty string, false otherwise
-
isNumeric
Returns true if the supplied string is non-null and only contains numeric characters.
- Parameters:
text
- The string to test
- Returns:
- true if the supplied string is non-null and only contains numeric characters
-
isNumericDecimal
Returns true if the supplied string is non-null and only contains numeric characters or a single decimal point. The value can have a leading negative ('-') symbol.
- Parameters:
text
- The string to test
- Returns:
- true if the supplied string is non-null and only contains numeric characters, optionally including a single '.' character.
-
isNumericDecimalExp
Returns true if the supplied string is non-null and only contains numeric characters or a single decimal point. The value can have a leading negative ('-') symbol. This version also allows an exponent ("E+nn" or "E-nn") at the end of the value.
- Parameters:
text
- The string to test
- Returns:
- true if the supplied string is non-null and only contains numeric characters, optionally including a single '.' character and a trailing exponent.
-
reduceNewlines
Ensures that a string returned from a browser (on any platform) conforms to unix line-EOF conventions. Any instances of consecutive CRs (0xD) and LFs (0xA) in a string will be reduced to a series of CRs (the number of CRs will be the maximum number of CRs or LFs found in a row).
- Parameters:
input
- the input string
- Returns:
- the canonicalised string, as described above
-
escapeHtml
Returns the HTML-escaped form of a string. The &, <, >, and " characters are converted to &amp;, &lt;, &gt;, and &quot; respectively.
Characters in the unicode control code blocks (apart from \t, \n and \r) are converted to &xfffd;
Characters outside of the ASCII printable range are converted into &xnnnn; form
- Parameters:
string
- the string to convert
- Returns:
- the HTML-escaped form of the string
-
escapeRegex
Returns a regex-escaped form of a string. That is, the pattern returned by this method, if compiled into a regex, will match the supplied string exactly.
- Parameters:
string
- the string to convert
- Returns:
- the regex-escaped form of the string
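The "matches the supplied string exactly" property can be illustrated with the JDK's own Pattern.quote, used here purely as a stand-in. This sketch does not call Text.escapeRegex, whose escaped output may look different; the class and method names below are hypothetical.

```java
import java.util.regex.Pattern;

// Illustration only: uses java.util.regex.Pattern.quote as a stand-in for a
// regex-escaping routine. It is not the implementation of Text.escapeRegex.
final class RegexEscapeSketch {

    // Returns true if 'candidate' is matched in full by the escaped form of 'literal'.
    static boolean matchesLiterally(String literal, String candidate) {
        Pattern p = Pattern.compile(Pattern.quote(literal));
        return p.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        // The metacharacters '.' and '*' are treated as plain text once escaped.
        System.out.println(matchesLiterally("a.b*c", "a.b*c"));
        System.out.println(matchesLiterally("a.b*c", "aXbbc"));
    }
}
```

The first call prints true and the second prints false, because the escaped pattern no longer treats '.' and '*' as wildcards.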
-
escapeCsv
Returns the csv-escaped form of a string. A csv-escaped string is used when writing to a CSV (comma-separated-value) file. It ensures that commas included within a string are quoted. We use the Microsoft-Excel quoting rules, so that our CSV files can be imported into that. These rules (derived from experimentation) are:
- Strings without commas (,), inverted commas ("), or newlines (\n) are returned as-is.
- Otherwise, the string is surrounded by inverted commas, and any inverted commas within the string are doubled-up (i.e. '"' becomes '""').
- A value that starts with any of "=", "@", "+" or "-" has a leading single apostrophe added to prevent the value being evaluated in Excel. The leading quote is visible to the user when the csv is opened, which may mean that it will have to be removed when roundtripping data. This may complicate things if the user actually wants a leading single quote in their CSV value.
Embedded newlines are inserted as-is, as per Excel. This will require some care whilst parsing if we want to be able to read these files.
- Parameters:
string
- the string to convert
- Returns:
- the csv-escaped form of the string
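The first two quoting rules can be sketched as follows. This is a minimal illustration with hypothetical names, not the library's code; it deliberately omits the leading-apostrophe rule, since the documentation does not say how that rule interacts with quoting.

```java
// Sketch of the first two rules only (return as-is, or quote and double-up
// inverted commas). The formula-prefix rule ("=", "@", "+", "-") is NOT
// implemented here, because its ordering relative to quoting is not documented.
final class CsvEscapeSketch {

    static String escapeCsvSketch(String s) {
        boolean needsQuoting = s.indexOf(',') >= 0
                || s.indexOf('"') >= 0
                || s.indexOf('\n') >= 0;
        if (!needsQuoting) {
            return s; // rule 1: nothing special in the value
        }
        // rule 2: surround with inverted commas and double any embedded ones
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(escapeCsvSketch("abc"));
        System.out.println(escapeCsvSketch("a,b"));
        System.out.println(escapeCsvSketch("say \"hi\""));
    }
}
```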
-
parseCsv
Given a csv-encoded string (as produced by the rules in escapeCsv(String)), produces a List of Strings which represent the individual values in the string. Note that this method is *not* equivalent to calling Arrays.asList(astring.split(",")).
Setting the whitespaceSensitive parameter to false allows leading and trailing whitespace in *non-quoted* values to be removed, e.g. if the input string text is:
abc,def, ghi, j k ,"lmn"," op "," q,r","""hello""", "another"
then parseCsv(text, false) will return the strings:
- abc
- def
- ghi
- j k
- lmn
- op (this String has one leading space, and a trailing space after 'p')
- q,r (this String has one leading space)
- "hello"
- another
and parseCsv(text, true) would throw a ParseException (since the final element is a quoted value, but begins with a space). If the , "another" text is removed, however, then parseCsv(text, true) would return the following:
- abc
- def
- ghi (this String has two leading spaces)
- j k (this String has one leading space and a trailing space after the 'k' character)
- lmn
- op (this String has one leading space, and a trailing space after 'p')
- q,r (this String has one leading space)
- "hello"
Most applications would want to use the 'whitespaceSensitive=false' form of this function, since (a) there is less chance of a ParseException, and (b) it's what an end-user would normally expect. This can be performed by calling the parseCsv(String) method.
Whitespace is determined by using the Character.isSpaceChar() method, which is Unicode-aware.
- Parameters:
text
- The CSV-encoded string to parse
whitespaceSensitive
- If set to false, leading and trailing whitespace in *non-quoted* values is trimmed.
- Returns:
- a List of Strings. The returned List is guaranteed to always contain at least one element.
- Throws:
NullPointerException
- if the text passed to this method is null
ParseException
- if a quoted value contains leading whitespace before the opening quote, or after the trailing quote.
ParseException
- if a quoted value has a start quote, but no end quote, or if a value has additional text after a quoted value (before the next comma or EOL).
-
parseCsv
-
parseCsv
Equivalent to parseCsv(text, false) (i.e. whitespace-insensitive parsing). Refer to the documentation for that method for more details.
- Parameters:
text
- The CSV-encoded string to parse
- Returns:
- a List of Strings. The returned List is guaranteed to always contain at least one element.
- Throws:
NullPointerException
- if the text passed to this method is null.
ParseException
- see parseCsv(String, boolean) for details.
-
escapeJava
Returns a java-escaped string. Replaces '"' with '\"'.
Since this is predominantly used in the query builder, I am not worrying about unicode sequences (SWIFT is ASCII) or newlines (although this may be necessary later) for multiline textboxes
- Returns:
- The java-escaped version of the string
-
escapeJavascript
Returns a javascript string. The characters ', " and \ are converted into their Unicode equivalents.
Non-printable characters are converted into unicode equivalents.
Newlines are now replaced with "\n"
- Returns:
- The javascript-escaped version of the string
-
escapeJavascript2
Deprecated. Use escapeJavascript(String) instead.
Returns a javascript string. The characters ', " and \ are converted into their Unicode equivalents.
Non-printable characters are converted into unicode equivalents.
- Returns:
- The javascript-escaped version of the string
-
unescapeJava
Unescapes a java-escaped string. Replaces '\"' with '"', '\\u0022' with '"', '\\u0027' with ''', '\\u005C' with '\'.
Since this is predominantly used in the query builder, I am not worrying about unicode sequences (SWIFT is ASCII) or newlines (although this may be necessary later) for multiline textboxes
- Returns:
- The unescaped version of the string
-
escapePython
Returns a python string, escaped so that it can be enclosed in a single-quoted string.
The characters ', " and \ are converted into their Unicode equivalents.
Non-printable characters are converted into unicode equivalents.
- Returns:
- The python-escaped version of the string
-
escapePathComponent
Escape a filename or path component. Characters that typically have special meanings in paths (":", "/", "\") are escaped with a preceding "\" character. Does not escape glob characters ("*" or "?"). Do not use this method to escape a full file path; when escaping a file path, escape each path component separately and then join the components with "/" characters (see createEscapedPath(String[])).
- Parameters:
string
- the filename or path component to escape
- Returns:
- the escaped form of the filename (or path component)
-
unescapePathComponent
Unescape a filename or path component. The escape sequences "\\", "\:" and "\/" are converted to "\", ":" and "/" respectively. All other escape sequences will raise an IllegalArgumentException.
See splitEscapedPath(String) to split an escaped path into components.
- Parameters:
pathComponent
- the filename or path component to unescape
- Returns:
- the unescaped form of the filename or path component
- Throws:
IllegalArgumentException
- if an unexpected escape is encountered, or the escape is unclosed
-
splitEscapedPath
Split a path, but allow forward slashes in path components if they're escaped by a preceding '\' character. Individual path components returned by this method will be unescaped.
splitPath(null) = NPE
splitPath("") = [ "" ]
splitPath("abc") = [ "abc" ]
splitPath("abc/def/ghi") = [ "abc", "def", "ghi" ]
splitPath("abc\\/def/ghi") = [ "abc/def", "ghi" ]
Opposite of createEscapedPath(String[])
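The splitting behaviour in the examples above can be sketched as a single pass over the characters. This is an illustrative sketch with hypothetical names, not the library's implementation; it assumes the only valid escapes are "\\", "\:" and "\/", as documented for unescapePathComponent(String).

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: split on unescaped '/' characters and unescape each component.
// Assumption: "\\", "\:" and "\/" are the only accepted escape sequences.
final class SplitEscapedPathSketch {

    static String[] split(String escapedPath) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < escapedPath.length(); i++) { // NPE on null input
            char c = escapedPath.charAt(i);
            if (c == '\\') {
                if (i + 1 >= escapedPath.length()) {
                    throw new IllegalArgumentException("Unclosed escape");
                }
                char next = escapedPath.charAt(++i);
                if (next != '\\' && next != ':' && next != '/') {
                    throw new IllegalArgumentException("Unexpected escape \\" + next);
                }
                current.append(next); // keep the escaped character literally
            } else if (c == '/') {
                parts.add(current.toString()); // unescaped slash ends a component
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString()); // last (possibly empty) component
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(String.join(" | ", split("abc\\/def/ghi")));
    }
}
```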
-
createEscapedPath
Escapes the components of a path String, returning an escaped full path String. Each path component is escaped with escapePathComponent(String) and then joined using '/' characters.
Opposite of splitEscapedPath(String).
- Parameters:
pathComponents
- the filename components
- Returns:
- an escaped path
-
escapeCss
Returns the CSS-escaped form of a string.
Characters outside of the printable ASCII range are converted to \nnnn form
- Parameters:
input
- the string to convert
- Returns:
- the CSS-escaped form of the string
-
getDisplayString
Returns the given string; but will truncate it to MAX_STRING_OUTPUT_CHARS. If it exceeds this length, a message is appended expressing how many characters were truncated. Strings with the key of 'exception' are not truncated (in order to display full stack traces when these occur). Any keys that contain the text 'password', 'Password', 'credential' or 'Credential' will be returned as eight asterisks.
This method is used in the debug JSP when dumping properties to the user, in order to prevent inordinately verbose output.
- Parameters:
key
- The key of the string we wish to display
string
- The string value
- Returns:
- A (possibly truncated) version of this string
-
getDisplayString
Returns the given string; but will truncate it to MAX_STRING_OUTPUT_CHARS. If it exceeds this length, a message is appended expressing how many characters were truncated. Strings with the key of 'exception' are not truncated (in order to display full stack traces when these occur). Any keys that contain the text 'password', 'Password', 'credential' or 'Credential' will be returned as eight asterisks.
This method is used in the debug JSP when dumping properties to the user, in order to prevent inordinately verbose output.
- Parameters:
key
- The key of the string we wish to display
string
- The string value
maxChars
- The maximum number of characters to display
- Returns:
- A (possibly truncated) version of this string
-
strDefault
Utility function to return a default if the supplied string is null. Shorthand for (strText==null) ? strDefaultText : strText;
- Returns:
- strText if strText is not null, otherwise strDefaultText
-
join
Return a string composed of a series of strings, separated with the specified delimiter.
- Parameters:
elements
- The array of elements to join
delimiter
- The delimiter to join each string with
- Throws:
NullPointerException
- if elements or delimiter is null
-
join
Return a string composed of a series of strings, separated with the specified delimiter.
- Parameters:
elements
- A Collection or Iterable of the elements to join
delimiter
- The delimiter to join each string with
- Throws:
NullPointerException
- if elements or delimiter is null
-
joinWithLast
public static String joinWithLast(String[] elements, boolean isQuoted, String delimiter, String lastDelimiter)
Return a string composed of a series of strings, separated with the specified delimiter. Each element is contained in single quotes. The final delimiter can be set to a different value, to produce text in the form "'a', 'b' or 'c'" or "'a', 'b' and 'c'".
There is no special handling of values containing quotes; see escapeCsv(String)
- Parameters:
elements
- The array of elements to join
isQuoted
- If true, each element is surrounded by single quotes
delimiter
- The delimiter to join each string with
lastDelimiter
- The delimiter to join the second-last and last elements
- Throws:
NullPointerException
- if elements or delimiter is null
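A minimal sketch of the joining behaviour described above, with hypothetical names. It assumes that a two-element input uses only the last delimiter, which the documentation does not state explicitly.

```java
// Sketch: join elements with 'delimiter', except that the final pair is joined
// with 'lastDelimiter'. Not the library's code; quoting is a plain wrap in
// single quotes with no escaping, as the documentation notes.
final class JoinWithLastSketch {

    static String joinWithLast(String[] elements, boolean isQuoted,
                               String delimiter, String lastDelimiter) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < elements.length; i++) {
            if (i > 0) {
                // the separator before the final element is the "last" delimiter
                sb.append(i == elements.length - 1 ? lastDelimiter : delimiter);
            }
            if (isQuoted) {
                sb.append('\'').append(elements[i]).append('\'');
            } else {
                sb.append(elements[i]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] values = { "a", "b", "c" };
        System.out.println(joinWithLast(values, true, ", ", " or "));
    }
}
```

Run as written, this prints 'a', 'b' or 'c'.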
-
joinWithLast
public static String joinWithLast(Iterable<?> elements, boolean isQuoted, String delimiter, String lastDelimiter)
Return a string composed of a series of strings, separated with the specified delimiter.
There is no special handling of values containing quotes; see escapeCsv(String)
- Parameters:
elements
- A Collection or Iterable containing the elements to join
isQuoted
- If true, each element is surrounded by single quotes
delimiter
- The delimiter to join each string with
lastDelimiter
- The delimiter to join the second-last and last elements
- Throws:
NullPointerException
- if elements or delimiter is null
-
replaceString
public static String replaceString(String originalString, String searchString, String replaceString)
An efficient search & replace routine. Replaces all instances of searchString within originalString with replaceString.
- Parameters:
originalString
- The string to search
searchString
- The string to search for
replaceString
- The string to replace it with
-
getFileContents
Reads a file, and returns its contents in a String.
- Parameters:
filename
- The file to read
- Returns:
- The contents of the file
- Throws:
IOException
- A problem occurred whilst attempting to read the file
-
getFileContents
Reads a file, and returns its contents in a String. Identical to calling getFileContents(file.getCanonicalPath()).
- Parameters:
file
- The file to read
- Returns:
- The contents of the file
- Throws:
IOException
- A problem occurred whilst attempting to read the file
-
indent
Prefixes every line supplied with a given indent. e.g. indent("\t", "abcd\nefgh") would return "\tabcd\n\tefgh". If the string ends in a newline, then the return value also ends with a newline.
- Parameters:
indentString
- The characters to indent with. Usually spaces or tabs, but could be something like a timestamp.
originalString
- The string to indent.
- Returns:
- The originalString, with every line (as separated by the newline character) prefixed with indentString.
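The line-prefixing behaviour can be sketched as below. The names are hypothetical and this is not the library's code; in particular, returning an empty string for empty input is an assumption.

```java
// Sketch: prefix each line with indentString, keeping newlines where they were.
// A trailing newline does not produce an extra, indented empty line.
final class IndentSketch {

    static String indent(String indentString, String originalString) {
        StringBuilder sb = new StringBuilder();
        int start = 0;
        while (start < originalString.length()) {
            int newline = originalString.indexOf('\n', start);
            // include the newline itself in the copied line, if there is one
            int end = (newline < 0) ? originalString.length() : newline + 1;
            sb.append(indentString).append(originalString, start, end);
            start = end;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(indent("> ", "abcd\nefgh\n"));
    }
}
```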
-
pad
Ensure that a string is padded with spaces so that it meets the required length. If the input string exceeds this length, then it is returned unchanged.
- Parameters:
inputString
- the string to pad
length
- the desired length
justification
- a JUSTIFICATION_* constant defining whether left, center or right justification is required.
- Returns:
- a padded string.
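A sketch of space-padding with the three justifications. The enum below is hypothetical and stands in for the JUSTIFICATION_* int constants, whose values are not documented here; placing the odd leftover space on the right when centering is also an assumption.

```java
// Sketch only: a hypothetical enum replaces the JUSTIFICATION_* int constants.
// Requires Java 11+ for String.repeat.
final class PadSketch {

    enum Justification { LEFT, CENTER, RIGHT }

    static String pad(String input, int length, Justification justification) {
        int missing = length - input.length();
        if (missing <= 0) {
            return input; // already long enough: returned unchanged
        }
        switch (justification) {
            case LEFT:
                return input + " ".repeat(missing);
            case RIGHT:
                return " ".repeat(missing) + input;
            default: // CENTER: assumed to put any odd leftover space on the right
                int left = missing / 2;
                return " ".repeat(left) + input + " ".repeat(missing - left);
        }
    }

    public static void main(String[] args) {
        System.out.println("[" + pad("ab", 5, Justification.CENTER) + "]");
    }
}
```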
-
getLastComponent
Given a period-separated list of components (e.g. variable references ("a.b.c") or classnames), returns the last component. For example, getLastComponent("com.randomnoun.common.util.Text") will return "Text".
If component is null, this function returns null.
If component contains no periods, this function returns the original string.
- Parameters:
string
- The string to retrieve the last component from
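The three cases above (null, no period, one or more periods) can be sketched with lastIndexOf. The class and method names are hypothetical; this is an illustration rather than the library's code.

```java
// Sketch of the documented behaviour using String.lastIndexOf.
final class LastComponentSketch {

    static String lastComponent(String string) {
        if (string == null) {
            return null; // null in, null out
        }
        int lastDot = string.lastIndexOf('.');
        // no period: return the original string; otherwise the text after it
        return (lastDot < 0) ? string : string.substring(lastDot + 1);
    }

    public static void main(String[] args) {
        System.out.println(lastComponent("com.randomnoun.common.util.Text"));
    }
}
```

Run as written, this prints Text.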
-
escapeQueryString
Escape this supplied string so it can represent a 'name' or 'value' component on a HTTP queryString. This generally involves escaping special characters into %xx form. Note that this only works for US-ASCII data.
-
encodeBase64
Encodes a string into Base64 format. No blanks or line breaks are inserted.
- Parameters:
s
- a String to be encoded.
- Returns:
- A String with the Base64 encoded data.
-
encodeBase64
Encodes a byte array into Base64 format. No blanks or line breaks are inserted.
- Parameters:
in
- an array containing the data bytes to be encoded.
- Returns:
- A character array with the Base64 encoded data.
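For comparison, the same two shapes (String to String, byte[] to char[]) can be produced with the JDK's java.util.Base64, whose basic encoder also inserts no line breaks. This sketch does not call the Text methods, and the use of UTF-8 for the String overload is an assumption; the documentation does not name a charset.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch using the JDK encoder; names are hypothetical.
// Assumption: strings are converted to bytes as UTF-8.
final class Base64Sketch {

    static String encode(String s) {
        return Base64.getEncoder().encodeToString(s.getBytes(StandardCharsets.UTF_8));
    }

    static char[] encode(byte[] in) {
        return Base64.getEncoder().encodeToString(in).toCharArray();
    }

    public static void main(String[] args) {
        System.out.println(encode("hello"));
        System.out.println(new String(encode(new byte[] { 1, 2, 3 })));
    }
}
```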
-
getNaturalComparator
Returns a comparator that compares contained numbers based on their numeric values and compares other parts using the current locale's order rules.
For example in German locale this will be a comparator that handles umlauts correctly and ignores upper/lower case differences.
- Returns:
- A string comparator that uses the current locale's order rules and handles embedded numbers correctly.
-
compareNatural
Compares two strings using the current locale's rules and comparing contained numbers based on their numeric values.
This is probably the best default comparison to use.
If you know that the texts to be compared are in a certain language that differs from the default locale's language, then get a collator for the desired locale (Collator.getInstance(java.util.Locale)) and pass it to compareNatural(java.text.Collator, String, String).
- Parameters:
s
- first string
t
- second string
- Returns:
- zero iff s and t are equal, a value less than zero iff s lexicographically precedes t and a value larger than zero iff s lexicographically follows t
-
unescapeQueryString
Unescape a HTTP escaped string.
- Parameters:
s
- The string to be unescaped
- Returns:
- the unescaped string.
-
getCommonPrefix
Returns the largest common prefix between two other strings; e.g. getCommonPrefix("abcsomething", "abcsometharg") would be "abcsometh".
- Parameters:
string1
- String number one
string2
- String number two
- Returns:
- the largest common prefix between the two strings
- Throws:
NullPointerException
- if string1 or string2 is null
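A minimal sketch of a longest-common-prefix scan, with hypothetical names; it illustrates the documented result rather than reproducing the library's code.

```java
// Sketch: walk both strings in step until the characters differ.
final class CommonPrefixSketch {

    static String commonPrefix(String string1, String string2) {
        int max = Math.min(string1.length(), string2.length()); // NPE on null input
        int i = 0;
        while (i < max && string1.charAt(i) == string2.charAt(i)) {
            i++;
        }
        return string1.substring(0, i);
    }

    public static void main(String[] args) {
        System.out.println(commonPrefix("abcsomething", "abcsometharg"));
    }
}
```

Run as written, this prints abcsometh.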
-
toFirstUpper
Uppercases the first character of a string.
- Parameters:
text
- text to modify
- Returns:
- the supplied text, with the first character converted to uppercase.
-
toFirstLower
Lowercases the first character of a string.
- Parameters:
text
- text to modify
- Returns:
- the supplied text, with the first character converted to lowercase.
-
getLevenshteinDistance
Number of character edits between two strings; taken from http://www.merriampark.com/ldjava.htm. There's a version in commons-lang, apparently, but according to the comments on that page, it uses O(n^2) memory, which can't be good.
- Parameters:
s
- string 1
t
- string 2
- Returns:
- the smallest number of edits required to convert s into t
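The memory concern mentioned above is usually addressed by keeping only two rows of the dynamic-programming table. The following is a generic sketch of that technique with hypothetical names; it is not claimed to be the code this method actually uses.

```java
// Sketch: Levenshtein distance with two rows, so memory is O(length of t)
// rather than O(length of s * length of t).
final class LevenshteinSketch {

    static int distance(String s, String t) {
        int n = s.length();
        int m = t.length();
        if (n == 0) return m;
        if (m == 0) return n;

        int[] previous = new int[m + 1]; // distances for the prefix of s ending at i-1
        int[] current = new int[m + 1];  // distances for the prefix of s ending at i
        for (int j = 0; j <= m; j++) {
            previous[j] = j; // converting "" to t[0..j) takes j insertions
        }
        for (int i = 1; i <= n; i++) {
            current[0] = i; // converting s[0..i) to "" takes i deletions
            for (int j = 1; j <= m; j++) {
                int cost = (s.charAt(i - 1) == t.charAt(j - 1)) ? 0 : 1;
                current[j] = Math.min(
                        Math.min(current[j - 1] + 1, previous[j] + 1),
                        previous[j - 1] + cost);
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[m];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting"));
    }
}
```

Run as written, this prints 3.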
-
getMD5
Return the md5 hash of a string.
- Parameters:
text
- text to hash
- Returns:
- a hex-encoded version of the MD5 hash
- Throws:
IllegalStateException
- if the java installation in use doesn't know about MD5
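A hex-encoded MD5 hash can be produced with the JDK's MessageDigest, roughly as below. The names are hypothetical, and two details are assumptions not stated in the documentation: the text is hashed as UTF-8 bytes, and the hex digits are lowercase.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch using java.security.MessageDigest.
// Assumptions: UTF-8 input encoding, lowercase hex output.
final class Md5Sketch {

    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // mirrors the documented IllegalStateException when MD5 is unavailable
            throw new IllegalStateException("MD5 is not available in this Java installation", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("abc"));
    }
}
```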
-
repeat
Returns a string composed of the supplied text, repeated 0 or more times.
- Parameters:
text
- text to repeat
count
- number of repetitions
- Returns:
- the repeated text
-
substitutePlaceholders
Perform ${xxxx}-style substitution of placeholders in text. Placeholders without values will be left as-is.
For example, given the set of variables:
- abc = def
then the result of substitutePlaceholders(variables, "xxxx${abc}yyyy${def}zzzz") will be "xxxxdefyyyy${def}zzzz".
$ followed by any other character will be left as-is.
- Parameters:
variables
- a set of variable names and values, used in the substitution
text
- the text to be substituted.
- Returns:
- text, with placeholders replaced with values in the variables parameter
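The substitution rule can be sketched with a regular expression, as below. This is an illustration with hypothetical names, not the library's implementation; it assumes placeholder names cannot contain '}' and that values are converted with String.valueOf.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: replace ${name} when 'variables' has a value for name, otherwise
// leave the placeholder text untouched. A '$' not followed by '{' never
// matches the pattern, so it is also left as-is.
final class PlaceholderSketch {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]*)\\}");

    static String substitute(Map<?, ?> variables, String text) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String replacement = variables.containsKey(name)
                    ? String.valueOf(variables.get(name))
                    : m.group(); // no value: keep "${name}" unchanged
            // quoteReplacement stops '$' and '\' being treated as group references
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> variables = Map.of("abc", "def");
        System.out.println(substitute(variables, "xxxx${abc}yyyy${def}zzzz"));
    }
}
```

Run as written, this prints xxxxdefyyyy${def}zzzz, matching the example in the description.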
-