Expand on strings
See #21
flaviut committed Apr 27, 2015
1 parent 9c62e0e commit 0fd94db
Showing 1 changed file with 12 additions and 6 deletions.
18 changes: 12 additions & 6 deletions content/strings.md
@@ -16,7 +16,7 @@ echo """
proc re(s: string): string = s
echo r" "" "
echo r".""."
echo re"\b[a-z]++\b"
```
``` console
@@ -29,17 +29,23 @@ words words words ⚑
<body>
<body/>
<html/>
"
.".
\b[a-z]++\b
```

-There are several types of strings literals:
+There are several types of string literals:

- Quoted Strings: created by wrapping the body in triple quotes; they never interpret escape codes
-- Raw Strings: created by prefixing the string with an `r`. There are no escape sequences don't work, except for `"`, which can be escaped as `""`
-- Proc Strings: raw strings, but the method name that prefixes the string is called
+- Raw Strings: created by prefixing the string with an `r`. They do not interpret escape sequences, except for `""`, which is interpreted as `"`. This means that `r"\b[a-z]\b"` is interpreted as `\b[a-z]\b` instead of failing to compile with a syntax error.
+- Proc Strings: raw strings, but the method name that prefixes the string is called, so that `foo"12\"` -> `foo(r"12\")`.
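
The three literal forms listed above can be compared side by side. This is a minimal sketch, not part of the commit; the `shout` proc is a hypothetical helper introduced only to show the proc-string call form.

``` nimrod
# Hypothetical helper (not from the original text): any proc that
# takes a string can be used as a string prefix.
proc shout(s: string): string = s & "!"

# Raw string: backslashes are kept literally, and "" yields one quote.
doAssert r"\b[a-z]\b" == "\\b[a-z]\\b"
doAssert r"a""b" == "a\"b"

# Triple-quoted string: escape codes are not interpreted either.
doAssert """\n""" == "\\n"

# Proc string: shout"12\" is the same call as shout(r"12\").
doAssert shout"12\" == shout(r"12\")
echo shout"12\"
```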

Strings are null-terminated, so that `cstring("foo")` requires zero copying. However, you should be careful that the lifetime of the cstring does not exceed the lifetime of the string it is based upon.
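
A minimal sketch of the conversion described above, assuming the `cstring` is only used while the source string is still alive:

``` nimrod
let s = "foo"

# The conversion does not copy: c refers to the buffer owned by s,
# so c must not be used after s has gone out of scope.
let c = cstring(s)

doAssert c.len == 3
# `$` builds a new Nim string from the cstring, which does copy.
doAssert $c == "foo"
```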

With respect to assignment semantics, strings can also almost be thought of as `seq[char]`. See [seqs][].

[seqs]: /seqs/#immutability
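
As a small illustration of those assignment semantics (a sketch with illustrative variable names), assigning a string copies it, so changing the copy leaves the original untouched:

``` nimrod
var a = "abc"
var b = a        # assignment copies the contents of a

b[0] = 'x'       # only b changes

doAssert a == "abc"
doAssert b == "xbc"
```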

## A note about unicode
Unicode symbols are allowed in strings, but are not treated in any special way, so if you want to count glyphs or uppercase Unicode symbols, you must use the `unicode` module.

Strings are generally assumed to be encoded as UTF-8, so thanks to UTF-8's backwards compatibility with ASCII, they can be processed exactly as ASCII, with all byte values above 127 ignored.
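
The difference between byte-level and Unicode-aware handling can be sketched as follows. This assumes the source file is saved as UTF-8; note that `runeLen` counts code points (runes), which is not always the same as the number of visible glyphs.

``` nimrod
import unicode

let s = "héllo"

# len counts bytes; é occupies two bytes in UTF-8.
doAssert s.len == 6

# runeLen from the unicode module counts code points instead.
doAssert s.runeLen == 5

# toUpper from the unicode module also handles non-ASCII letters.
doAssert s.toUpper == "HÉLLO"
```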
