[flaviocopes] The Advanced JavaScript Course - JavaScript Standard Library - part 2

Replacing using Regular Expressions

We already saw how to check if a string contains a pattern.

We also saw how to extract parts of a string to an array, matching a pattern.

Let’s see how to replace parts of a string based on a pattern.

The String object in JavaScript has a replace() method, which can be used without regular expressions to perform a single replacement on a string:

"Hello world!".replace('world', 'dog') //Hello dog!
"My dog is a good dog!".replace('dog', 'cat') //My cat is a good dog!

This method also accepts a regular expression as argument:

"Hello world!".replace(/world/, 'dog') //Hello dog!

A string pattern only replaces the first occurrence. To replace every occurrence, use a regular expression with the g flag (since ES2021 you can also call replaceAll() ):

"My dog is a good dog!".replace(/dog/g, 'cat') //My cat is a good cat!

Groups let us do more fancy things, like moving around parts of a string:

"Hello, world!".replace(/(\w+), (\w+)!/, '$2: $1!!!')
// "world: Hello!!!"

Instead of using a string you can use a function, to do even fancier things. It receives the matched string first, followed by one argument for each capturing group (the same values returned by String.match(RegExp) or RegExp.exec(String) ), so the number of arguments depends on the number of groups:

"Hello, world!".replace(/(\w+), (\w+)!/, (matchedString, first, second) => {
  console.log(first);
  console.log(second);

  return `${second.toUpperCase()}: ${first}!!!`
})
//"WORLD: Hello!!!"
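A replacer function is handy whenever the replacement depends on what was matched. As a minimal sketch (toCamelCase is a hypothetical helper name, not part of the standard library), this converts snake_case identifiers to camelCase by uppercasing the letter captured after each underscore:

```javascript
// Hypothetical helper: turns snake_case into camelCase.
// The group ([a-z]) captures the letter that follows each underscore.
const toCamelCase = (str) =>
  str.replace(/_([a-z])/g, (match, letter) => letter.toUpperCase())

console.log(toCamelCase('background_color')) // backgroundColor
console.log(toCamelCase('margin_top_left')) // marginTopLeft
```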

Greediness

Regular expressions are said to be greedy by default.

What does it mean?

Take this regex

/\$(.+)\s?/

It is supposed to extract a dollar amount from a string

/\$(.+)\s?/.exec('This costs $100')[1]
//100

but if we have more words after the number, it freaks out

/\$(.+)\s?/.exec('This costs $100 and it is less than $200')[1]
//100 and it is less than $200

Why? Because after the $ sign, .+ matches any character, and it won’t stop until it reaches the end of the string. The match then succeeds because \s? makes the trailing space optional.

To fix this, we need to tell the regex to be lazy, and perform the least amount of matching possible. We can do so using the ? symbol after the quantifier:

/\$(.+?)\s/.exec('This costs $100 and it is less than $200')[1]
//100

I removed the ? after \s , otherwise the regex would match only the first digit, since the space was optional.

So, ? means different things based on its position, because it can be both a quantifier and a lazy mode indicator.
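The difference between the two modes is easy to see side by side. In this small sketch (the sample string and variable names are just illustrations), the same pattern is run once with a greedy quantifier and once with a lazy one:

```javascript
const html = '<b>one</b> and <b>two</b>'

// Greedy: .* runs as far as it can, up to the last </b>
const greedy = html.match(/<b>.*<\/b>/)[0]

// Lazy: .*? stops at the first </b> it finds
const lazy = html.match(/<b>.*?<\/b>/)[0]

console.log(greedy) // <b>one</b> and <b>two</b>
console.log(lazy) // <b>one</b>
```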

Lookaheads: match a string depending on what follows it

Use ?= to match a string that’s followed by a specific substring:

/Roger(?= Waters)/

/Roger(?= Waters)/.test('Roger is my dog') //false
/Roger(?= Waters)/.test('Roger is my dog and Roger Waters is a famous musician') //true

?! performs the inverse operation, matching if a string is not followed by a specific substring:

/Roger(?! Waters)/

/Roger(?! Waters)/.test('Roger is my dog') //true
/Roger(?! Waters)/.test('Roger Waters is a famous musician') //false
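Lookaheads do not consume characters, which makes them useful inside replace(). One possible sketch (addThousandsSeparators is a hypothetical helper, and it assumes the input is a plain string of digits with no decimals) combines a positive and a negative lookahead to insert thousands separators:

```javascript
// Hypothetical helper: inserts a comma before every group of 3 digits.
// (?=(\d{3})+(?!\d)) means: followed by one or more groups of exactly
// 3 digits, and then by no further digit.
// \B avoids placing a comma at the very start of the number.
const addThousandsSeparators = (digits) =>
  digits.replace(/\B(?=(\d{3})+(?!\d))/g, ',')

console.log(addThousandsSeparators('1234567')) // 1,234,567
console.log(addThousandsSeparators('1000')) // 1,000
console.log(addThousandsSeparators('999')) // 999
```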

Lookbehinds: match a string depending on what precedes it

This is an ES2018 feature.

Lookaheads use the ?= symbol. Lookbehinds use ?<= .

/(?<=Roger) Waters/

/(?<=Roger) Waters/.test('Pink Waters is my dog') //false
/(?<=Roger) Waters/.test('Roger is my dog and Roger Waters is a famous musician') //true

A lookbehind is negated using ?<! :

/(?<!Roger) Waters/

/(?<!Roger) Waters/.test('Pink Waters is my dog') //true
/(?<!Roger) Waters/.test('Roger is my dog and Roger Waters is a famous musician') //false
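As a quick sketch of how lookbehinds can be used for extraction (the sample text and variable names are made up for illustration, and an engine with ES2018 support is assumed), this pulls out only the amounts that are, or are not, prefixed by a dollar sign:

```javascript
const priceList = 'Shirt: $25, shoes: €80, hat: $12'

// Digits that come right after a $ sign (the $ is not part of the match)
const dollarAmounts = priceList.match(/(?<=\$)\d+/g)

// Digits NOT preceded by a $ sign; the |\d alternative prevents
// matching from the middle of a dollar amount, like the 5 in $25
const otherAmounts = priceList.match(/(?<!\$|\d)\d+/g)

console.log(dollarAmounts) // [ '25', '12' ]
console.log(otherAmounts) // [ '80' ]
```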

Regular Expressions and Unicode

The u flag is mandatory when working with Unicode strings, in particular when you might need to handle characters in the astral planes, the ones beyond the first 65,536 Unicode code points (U+0000 to U+FFFF, the Basic Multilingual Plane).

Like Emojis, for example, but not just those.

If you don’t add that flag, this simple regex that should match one character will not work, because JavaScript represents that emoji internally with 2 UTF-16 code units:

/^.$/.test('a') //✅
/^.$/.test('🐶') //❌
/^.$/u.test('🐶') //✅

So, always use the u flag.

Unicode characters, just like normal characters, can be used in ranges:

/[a-z]/.test('a')  //✅
/[1-9]/.test('1')  //✅

/[🐶-🦊]/u.test('🐺')  //✅
/[🐶-🦊]/u.test('🐛')  //❌

JavaScript checks the internal code point values, so 🐶 < 🐺 < 🦊 because \u{1F436} < \u{1F43A} < \u{1F98A} . Check the full Emoji list to get those codes, and to find out the order (tip: the macOS Emoji picker has some emojis in a mixed order, don’t count on it).
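To see why the u flag matters, it can help to inspect how an astral character is stored. A short sketch (the variable name is arbitrary, and the dog emoji is written with its code point escape):

```javascript
const dogEmoji = '\u{1F436}' // the dog face emoji, written as a code point escape

console.log(dogEmoji.length) // 2 (UTF-16 code units)
console.log([...dogEmoji].length) // 1 (iteration works by code point)
console.log(dogEmoji.codePointAt(0).toString(16)) // 1f436

console.log(/^.$/.test(dogEmoji)) // false: . matches a single code unit
console.log(/^.$/u.test(dogEmoji)) // true: . matches a whole code point
```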

Unicode property escapes

As we saw above, in a regular expression pattern you can use \d to match any digit, \s to match any white space character, \w to match any alphanumeric character (or underscore), and so on.

Unicode property escapes are an ES2018 feature that extends this concept to all Unicode characters, introducing \p{} and its negation \P{} .

Any Unicode character has a set of properties. For example, Script determines the writing system the character belongs to, ASCII is a boolean that’s true for ASCII characters, and so on. You can put a property name between the curly braces, and the regex will check for that to be true:

/^\p{ASCII}+$/u.test('abc')   //✅
/^\p{ASCII}+$/u.test('ABC@')  //✅
/^\p{ASCII}+$/u.test('ABC🙃') //❌

ASCII_Hex_Digit is another boolean property, that checks if the string only contains valid hexadecimal digits:

/^\p{ASCII_Hex_Digit}+$/u.test('0123456789ABCDEF') //✅
/^\p{ASCII_Hex_Digit}+$/u.test('h')                //❌

There are many other boolean properties, which you check by adding their name between the curly braces, including Uppercase , Lowercase , White_Space , Alphabetic , Emoji and more:

/^\p{Lowercase}$/u.test('h') //✅
/^\p{Uppercase}$/u.test('H') //✅

/^\p{Emoji}+$/u.test('H')   //❌
/^\p{Emoji}+$/u.test('🙃🙃') //✅

In addition to those binary properties, you can check any of the Unicode character properties against a specific value. In this example, I check if the string is written in the Greek or Latin alphabet:

/^\p{Script=Greek}+$/u.test('ελληνικά') //✅
/^\p{Script=Latin}+$/u.test('hey') //✅
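Property escapes also work with match() and replace(), not only with test(). A minimal sketch, assuming an ES2018 engine (the sample string and variable names are illustrative; Letter and Decimal_Number are Unicode general categories):

```javascript
const mixed = 'abc αβγ 123'

// Runs of Greek-script characters
const greekWords = mixed.match(/\p{Script=Greek}+/gu)

// Runs of letters in any script
const allWords = mixed.match(/\p{Letter}+/gu)

// Remove every decimal digit
const withoutDigits = mixed.replace(/\p{Decimal_Number}/gu, '')

console.log(greekWords) // [ 'αβγ' ]
console.log(allWords) // [ 'abc', 'αβγ' ]
console.log(withoutDigits) // 'abc αβγ '
```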

Read more about all the properties you can use directly on the TC39 proposal.

Examples

Extract a number from a string

Supposing a string has only one number you need to extract, /\d+/ should do it:

'Test 123123329'.match(/\d+/)
// Array [ "123123329" ]
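If the string might contain several numbers, one possible variation (the sample string and variable names are illustrative) is to add the g flag and convert every match with Number. The `|| []` fallback is there because match() returns null when nothing matches:

```javascript
const orderText = 'Order 66 has 3 items and costs 120 dollars'

// With the g flag, match() returns every match as a string
const foundNumbers = (orderText.match(/\d+/g) || []).map(Number)

// No digits at all: match() returns null, so we fall back to []
const noNumbers = ('no digits here'.match(/\d+/g) || []).map(Number)

console.log(foundNumbers) // [ 66, 3, 120 ]
console.log(noNumbers) // []
```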

Match an email address

A simplistic approach is to check non-space characters before and after the @ sign, using \S :

/(\S+)@(\S+)\.(\S+)/

/(\S+)@(\S+)\.(\S+)/.exec('copesc@gmail.com')
//["copesc@gmail.com", "copesc", "gmail", "com"]

This is a simplistic example however, as many invalid emails are still satisfied by this regex.

Capture text between double quotes

Suppose you have a string that contains something in double quotes, and you want to extract that content.

The best way to do so is by using a capturing group , because we know the match starts and ends with " , and we can easily target it, but we also want to remove those quotes from our result.

We’ll find what we need in result[1] :

const hello = 'Hello "nice flower"'
const result = /"([^"]*)"/.exec(hello)
//Array [ "\"nice flower\"", "nice flower" ]

Get the content inside an HTML tag

For example, get the content inside a span tag, allowing any number of attributes inside the tag:

/<span\b[^>]*>(.*?)<\/span>/

/<span\b[^>]*>(.*?)<\/span>/.exec('test')
// null
/<span\b[^>]*>(.*?)<\/span>/.exec('<span>test</span>')
// ["<span>test</span>", "test"]
/<span\b[^>]*>(.*?)<\/span>/.exec('<span class="x">test</span>')
// ['<span class="x">test</span>', 'test']
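When the string contains more than one span, a sketch like the following could collect all of their contents. It assumes String.prototype.matchAll() (ES2020) is available, and the sample markup and variable names are made up for illustration:

```javascript
const markup = '<span>one</span> <span class="x">two</span>'

// matchAll() requires the g flag and yields one match array per span;
// index 1 of each match is the first capturing group
const spanContents = [...markup.matchAll(/<span\b[^>]*>(.*?)<\/span>/g)].map(
  (match) => match[1]
)

console.log(spanContents) // [ 'one', 'two' ]
```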

Set and Map

JavaScript Sets

A Set data structure lets you add data to a container.

ECMAScript 6 (also called ES2015) introduced the Set data structure to the JavaScript world, along with Map.

A Set is a collection of objects or primitive types (strings, numbers or booleans), and you can think of it as a Map where values are used as map keys, with the map value always being a boolean true.

Initialize a Set

A Set is initialized by calling:

const s = new Set()

Add items to a Set

You can add items to the Set by using the add method:

s.add('one')
s.add('two')

A set only stores unique elements, so calling s.add('one') multiple times won’t add new items.

The add() method takes a single element, so you can’t add several elements in one call. You need to call add() once per element (it returns the Set, so the calls can be chained).

Check if an item is in the set

Once an element is in the set, we can check if the set contains it:

s.has('one') //true
s.has('three') //false

Delete an item from a Set by key

Use the delete() method:

s.delete('one')

Determine the number of items in a Set

Use the size property:

s.size

Delete all items from a Set

Use the clear() method:

s.clear()

Iterate the items in a Set

Use the keys() or values() methods - they are equivalent:

for (const k of s.keys()) {
  console.log(k)
}

for (const k of s.values()) {
  console.log(k)
}

The entries() method returns an iterator, which you can use like this:

const i = s.entries()
console.log(i.next())

Calling i.next() returns each element as a { value, done: false } object, where value is a [value, value] array (entries() mirrors the Map API, so each item appears as both key and value). When the iterator ends, done is true .

You can also use the forEach() method on the set:

s.forEach(v => console.log(v))

or you can just use the set in a for…of loop:

for (const k of s) {
  console.log(k)
}

Initialize a Set with values

You can initialize a Set with a set of values:

const s = new Set([1, 2, 3, 4])

Convert to array

Convert the Set keys into an array

const a = [...s.keys()]

// or

const a = [...s.values()]
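Because a Set only keeps unique values, converting an array to a Set and back is a common way to remove duplicates. The sketch below (all variable names are illustrative) also shows how a union and an intersection might be built with the spread operator and has():

```javascript
const withDuplicates = [1, 2, 2, 3, 3, 3]

// Array -> Set drops the duplicates, Set -> array gives us a list again
const uniqueValues = [...new Set(withDuplicates)]

const left = new Set([1, 2, 3])
const right = new Set([2, 3, 4])

// Union: every element of both sets
const union = new Set([...left, ...right])

// Intersection: only the elements of left that right also has
const intersection = new Set([...left].filter((item) => right.has(item)))

console.log(uniqueValues) // [ 1, 2, 3 ]
console.log([...union]) // [ 1, 2, 3, 4 ]
console.log([...intersection]) // [ 2, 3 ]
```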

A WeakSet

A WeakSet is a special kind of Set.

In a Set, items are never garbage collected as long as the Set itself is reachable. A WeakSet instead holds its items weakly, so they can be freely garbage collected. Every item of a WeakSet is an object. When no other reference to that object remains, it can be garbage collected.

Here are the main differences:

  1. you cannot iterate over the WeakSet
  2. you cannot clear all items from a WeakSet
  3. you cannot check its size

A WeakSet is generally used by framework-level code, and only exposes these methods:

  • add()
  • has()
  • delete()
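One way a WeakSet might be used is to remember which objects have already been handled, without keeping them alive. A minimal sketch (visitOnce and the sample objects are hypothetical names):

```javascript
const seenObjects = new WeakSet()

// Hypothetical helper: returns true only the first time an object is seen
const visitOnce = (obj) => {
  if (seenObjects.has(obj)) return false
  seenObjects.add(obj)
  return true
}

const node = { id: 1 }

console.log(visitOnce(node)) // true
console.log(visitOnce(node)) // false
console.log(visitOnce({ id: 1 })) // true, it is a different object
```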

JavaScript Maps

What is a Map

A Map data structure lets you associate data to a key.

Before ES6

ECMAScript 6 (also called ES2015) introduced the Map data structure to the JavaScript world, along with Set.

Before its introduction, people generally used objects as maps, by associating some object or value to a specific key value:

const car = {}
car['color'] = 'red'
car.owner = 'Flavio'
console.log(car['color']) //red
console.log(car.color) //red
console.log(car.owner) //Flavio
console.log(car['owner']) //Flavio

Enter Map

ES6 introduced the Map data structure, providing us a proper tool to handle this kind of data organization.

A Map is initialized by calling:

const m = new Map()

Add items to a Map

You can add items to the map by using the set method:

m.set('color', 'red')
m.set('age', 2)

Get an item from a map by key

And you can get items out of a map by using get :

const color = m.get('color')
const age = m.get('age')

Delete an item from a map by key

Use the delete() method:

m.delete('color')

Delete all items from a map

Use the clear() method:

m.clear()

Check if a map contains an item by key

Use the has() method:

const hasColor = m.has('color')

Find the number of items in a map

Use the size property:

const size = m.size

Initialize a map with values

You can initialize a map with a set of values:

const m = new Map([['color', 'red'], ['owner', 'Flavio'], ['age', 2]])

Map keys

Just like any value (object, array, string, number) can be used as the value of the key-value entry of a map item, any value can be used as the key , even objects.

If you try to get a non-existing key using get() out of a map, it will return undefined .
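A short sketch of both points (the object and variable names are illustrative). Note that object keys are compared by identity, so a second object with the same content is a different key:

```javascript
const roger = { name: 'Roger' }
const owners = new Map()

// An object used as a key
owners.set(roger, 'Flavio')

console.log(owners.get(roger)) // Flavio
console.log(owners.get({ name: 'Roger' })) // undefined, a different object
console.log(owners.get('missing')) // undefined
```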

Weird situations you’ll almost never find in real life

const m = new Map()
m.set(NaN, 'test')
m.get(NaN) //test

const m2 = new Map()
m2.set(+0, 'test')
m2.get(-0) //test

Iterating over a map

Iterate over map keys

Map offers the keys() method we can use to iterate on all the keys:

for (const k of m.keys()) {
  console.log(k)
}

Iterate over map values

The Map object offers the values() method we can use to iterate on all the values:

for (const v of m.values()) {
  console.log(v)
}

Iterate over map key, value pairs

The Map object offers the entries() method we can use to iterate on all the key, value pairs:

for (const [k, v] of m.entries()) {
  console.log(k, v)
}

which can be simplified to

for (const [k, v] of m) {
  console.log(k, v)
}

Convert to array

Convert the map keys into an array

const a = [...m.keys()]

Convert the map values into an array

const a = [...m.values()]
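Since plain objects were the traditional way to build maps, it can be useful to convert between the two. A possible sketch, assuming Object.entries() (ES2017) and Object.fromEntries() (ES2019) are available; the variable names are illustrative:

```javascript
const carObject = { color: 'red', owner: 'Flavio' }

// Object -> Map: Object.entries() gives the [key, value] pairs Map expects
const carMap = new Map(Object.entries(carObject))

// Map -> array of [key, value] pairs
const carPairs = [...carMap]

// Map -> Object (only sensible when the keys are strings or symbols)
const carAgain = Object.fromEntries(carMap)

console.log(carMap.get('color')) // red
console.log(carPairs) // [ [ 'color', 'red' ], [ 'owner', 'Flavio' ] ]
console.log(carAgain) // { color: 'red', owner: 'Flavio' }
```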

WeakMap

A WeakMap is a special kind of map.

In a Map, items are never garbage collected as long as the Map itself is reachable. A WeakMap instead holds its keys weakly. Every key of a WeakMap is an object. When no other reference to that key object remains, the entry (key and value) can be garbage collected.

Here are the main differences:

  1. you cannot iterate over the keys or values (or key-values) of a WeakMap
  2. you cannot clear all items from a WeakMap
  3. you cannot check its size

A WeakMap exposes these methods, which are equivalent to the Map ones:

  • get(k)
  • set(k, v)
  • has(k)
  • delete(k)

The use cases of a WeakMap are less evident than the ones of a Map, and you might never find the need for them, but essentially it can be used to build a memory-sensitive cache that is not going to interfere with garbage collection, or for careful encapsulation and information hiding.
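The cache idea could look something like this sketch. Everything here is hypothetical (countKeys stands in for an expensive computation, and the counter only exists to show when the work actually runs):

```javascript
// Hypothetical cache: an entry lives only as long as its key object does
const resultCache = new WeakMap()
let computations = 0

function countKeys(obj) {
  if (resultCache.has(obj)) return resultCache.get(obj)
  computations++ // track how often the "expensive" work actually runs
  const result = Object.keys(obj).length
  resultCache.set(obj, result)
  return result
}

const settings = { theme: 'dark', lang: 'en' }

console.log(countKeys(settings)) // 2
console.log(countKeys(settings)) // 2 (served from the cache)
console.log(computations) // 1
```

Once nothing else references settings, the cached entry becomes eligible for garbage collection too, which a regular Map would prevent.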

Errors

In this lesson I want to describe the errors in JavaScript. Those errors are raised when an exception happens. We’ll talk about exceptions soon.

We have 7 error objects:

  • Error
  • EvalError
  • RangeError
  • ReferenceError
  • SyntaxError
  • TypeError
  • URIError

Let’s analyze each one of those

Error

This is the generic error, and it’s the one all the other error objects inherit from. The JavaScript engine itself rarely throws a plain Error : it usually fires one of the more specific errors listed above, which inherit from Error . Your own code, on the other hand, can create one with new Error('message') .

It contains 2 properties:

  • message : the error description, a human readable message that should explain what error happened
  • name : the type of error occurred (assumes the value of the specific error object name, for example, TypeError or SyntaxError )

and provides just one method, toString() , which is responsible for generating a meaningful string from the error, which can be used to print it to screen.
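A brief sketch of these properties in action (the variable names are arbitrary). The message of a built-in error differs between engines, so only name and the inheritance chain are checked for the caught error:

```javascript
let caughtError
try {
  null.color // reading a property of null throws a TypeError
} catch (err) {
  caughtError = err
}

console.log(caughtError.name) // TypeError
console.log(caughtError instanceof TypeError) // true
console.log(caughtError instanceof Error) // true

// An Error created by our own code
const customError = new Error('Something went wrong')
console.log(customError.name) // Error
console.log(customError.message) // Something went wrong
console.log(customError.toString()) // Error: Something went wrong
```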

EvalError

This error is defined in modern JavaScript but never actually thrown by the language itself. It was introduced in ECMAScript 3, and the standard still defines it, but since ECMAScript 5.1 it is kept only for compatibility with older code.

It was used to indicate that the global function eval() was used incorrectly, in a way incompatible with its definition.

RangeError

A RangeError will fire when a numeric value is not in its range of allowed values.

The simplest example is when you set an array length to a negative value:

[].length = -1 //RangeError: Invalid array length

or when you set it to a number higher than 4294967295

[].length = 4294967295 //4294967295
[].length = 4294967296 //RangeError: Invalid array length

(this magic number is specified in the JavaScript spec as the maximum range of a 32-bit unsigned integer, equivalent to Math.pow(2, 32) - 1 )
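Array lengths are not the only source of range errors. As a hedged sketch, here are a few other standard calls that throw a RangeError when given an out-of-range number (the collecting loop and variable names are only for illustration):

```javascript
const rangeAttempts = [
  () => 'ab'.repeat(-1), // repeat count must not be negative
  () => (1.5).toFixed(101), // digits must be between 0 and 100
  () => new Array(-1), // array length must not be negative
]

const rangeErrorNames = []
for (const attempt of rangeAttempts) {
  try {
    attempt()
  } catch (err) {
    rangeErrorNames.push(err.name)
  }
}

console.log(rangeErrorNames) // [ 'RangeError', 'RangeError', 'RangeError' ]
```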

Here are the most common range errors you can spot in the wild:

ReferenceError

A ReferenceError indicates that an invalid reference value has been detected: a JavaScript program is trying to read a variable that does not exist.

dog //ReferenceError: dog is not defined
dog = 2 //ReferenceError: dog is not defined (strict mode only)

Be aware that outside strict mode the assignment above does not throw: it creates a dog variable on the global object instead.
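The strict mode behavior can be checked with a small sketch like this one (the function and variable names are hypothetical). It also shows that typeof can be applied to an undeclared name without throwing:

```javascript
function assignInStrictMode() {
  'use strict'
  try {
    undeclaredVariable = 2 // never declared with var, let or const
    return 'no error'
  } catch (err) {
    return err.name
  }
}

console.log(assignInStrictMode()) // ReferenceError

// typeof does not throw for a name that was never declared
console.log(typeof neverDeclared) // undefined
```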

Here are the most common reference errors you can spot in the wild:

SyntaxError

A SyntaxError is raised when a syntax error is found in a program.

Here are some examples of code that generate a syntax error.

A function statement without name:

function() {
  return 'Hi!'
}
//SyntaxError: function statement requires a name

Missing comma after an object property definition:

const dog = {
  name: 'Roger'
  age: 5
}
//SyntaxError: missing } after property list

Here are the most common syntax errors you can spot in the wild:

TypeError

A TypeError happens when a value has a type that’s different than the one expected.

The simplest example is trying to invoke a number:

1() //TypeError: 1 is not a function
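In everyday code a TypeError more often comes from a value being undefined or from calling something that does not exist. A sketch with a few such cases (the collecting loop and names are only for illustration):

```javascript
const typeAttempts = [
  () => undefined.name, // reading a property of undefined
  () => 'text'.missingMethod(), // calling something that is not a function
  () => {
    const fixed = 1
    fixed = 2 // assigning to a const binding
  },
]

const typeErrorNames = []
for (const attempt of typeAttempts) {
  try {
    attempt()
  } catch (err) {
    typeErrorNames.push(err.name)
  }
}

console.log(typeErrorNames) // [ 'TypeError', 'TypeError', 'TypeError' ]
```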

Here are the most common type errors you can spot in the wild:

URIError

This error is raised when calling one of the global functions that work with URIs:

  • decodeURI()
  • decodeURIComponent()
  • encodeURI()
  • encodeURIComponent()

and passing an invalid URI.
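For instance, a lone % is not a valid escape sequence, so decoding it fails. The sketch below wraps the call in a hypothetical safeDecode helper that returns null instead of throwing:

```javascript
// Hypothetical helper: returns null when the input cannot be decoded
const safeDecode = (value) => {
  try {
    return decodeURIComponent(value)
  } catch (err) {
    if (err instanceof URIError) return null
    throw err // anything else is unexpected, so rethrow it
  }
}

console.log(safeDecode('%E2%82%AC')) // €
console.log(safeDecode('%')) // null
```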

The global object

JavaScript provides a global object which has a set of properties, functions and objects that are accessed globally, without a namespace.

The properties are

  • Infinity
  • NaN
  • undefined

and the functions are

  • decodeURI()
  • decodeURIComponent()
  • encodeURI()
  • encodeURIComponent()
  • eval()
  • isFinite()
  • isNaN()
  • parseFloat()
  • parseInt()

The objects are the ones you already saw before, which are part of the standard library:

  • Array
  • Boolean
  • Date
  • Function
  • JSON
  • Math
  • Number
  • Object
  • RegExp
  • String
  • Symbol

and errors.

Let’s now describe here the global properties and functions.

Infinity

Infinity in JavaScript is a value that represents infinity .

Infinity is positive infinity. To get negative infinity, use the - operator: -Infinity .

Those are equivalent to Number.POSITIVE_INFINITY and Number.NEGATIVE_INFINITY .

Adding any finite number to Infinity , or multiplying Infinity by any positive number, still gives Infinity .
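A few quick checks of Infinity arithmetic, as a sketch (the variable names are arbitrary). Multiplying by zero or subtracting Infinity from itself has no meaningful result, so JavaScript returns NaN:

```javascript
const plusOne = Infinity + 1
const doubled = Infinity * 2
const negated = Infinity * -1
const timesZero = Infinity * 0
const minusItself = Infinity - Infinity
const divisionByZero = 1 / 0

console.log(plusOne) // Infinity
console.log(doubled) // Infinity
console.log(negated) // -Infinity
console.log(timesZero) // NaN
console.log(minusItself) // NaN
console.log(divisionByZero) // Infinity
console.log(Infinity === Number.POSITIVE_INFINITY) // true
```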

NaN

The global NaN value is an acronym for Not a Number . It’s returned by operations such as zero divided by zero, invalid parseInt() operations, or other operations.

parseInt() //NaN
parseInt('a') //NaN
0/0 //NaN

A special thing to consider is that a NaN value is never equal to another NaN value. You must use the isNaN() global function to check if a value evaluates to NaN :

NaN === NaN //false
0/0 === NaN //false
isNaN(0/0) //true
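As a side note, the global isNaN() converts its argument to a number first, while Number.isNaN() (added in ES2015) does not. A short sketch of the difference, along with two other standard ways to detect NaN (variable names are arbitrary):

```javascript
const globalCheck = isNaN('x') // 'x' is converted to a number first
const strictCheck = Number.isNaN('x') // no conversion: 'x' is not NaN
const realNaN = Number.isNaN(0 / 0)

console.log(globalCheck) // true
console.log(strictCheck) // false
console.log(realNaN) // true

// Object.is() and Array.prototype.includes() treat NaN as equal to itself
console.log(Object.is(NaN, NaN)) // true
console.log([NaN].includes(NaN)) // true
console.log([NaN].indexOf(NaN)) // -1 (indexOf uses ===)
```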

undefined

The global undefined property holds the primitive value undefined .

Running a function that does not specify a return value returns undefined :

const test = () => {}
test() //undefined

Unlike NaN , we can compare an undefined value with undefined , and get true:

undefined === undefined

It’s common to use the typeof operator to determine if a variable is undefined:

if (typeof dog === 'undefined') {

}

decodeURI()

Performs the opposite operation of encodeURI()

decodeURIComponent()

Performs the opposite operation of encodeURIComponent()

encodeURI()

This function is used to encode a complete URL. It percent-encodes every character (as %XX escape sequences) except letters, digits, and the characters that have a special meaning in a URI structure:

~!@#$&*()=:/,;?+'-_.

Example:

encodeURI("http://flaviocopes.com/ hey!/")
//"http://flaviocopes.com/%20hey!/"

encodeURIComponent()

Similar to encodeURI() , encodeURIComponent() is meant to have a different job.

Instead of being used to encode an entire URI, it encodes a portion of a URI.

It percent-encodes every character (as %XX escape sequences) except letters, digits, and these special characters:

-_.!~*'()

Example:

encodeURIComponent("http://flaviocopes.com/ hey!/")
// "http%3A%2F%2Fflaviocopes.com%2F%20hey!%2F"
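The difference matters most when building query strings. In this sketch (example.com and the variable names are placeholders, not from the lesson), a user-provided value contains an & that must not be read as a parameter separator:

```javascript
const searchTerm = 'cats & dogs'
const baseUrl = 'https://example.com/search?q='

// encodeURIComponent() escapes the & inside the value
const safeUrl = baseUrl + encodeURIComponent(searchTerm)

// encodeURI() leaves & alone, so the value would be cut at "cats "
const brokenUrl = encodeURI(baseUrl + searchTerm)

console.log(safeUrl) // https://example.com/search?q=cats%20%26%20dogs
console.log(brokenUrl) // https://example.com/search?q=cats%20&%20dogs

// decodeURIComponent() restores the original value
console.log(decodeURIComponent('cats%20%26%20dogs')) // cats & dogs
```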

eval()

This is a special function that takes a string that contains JavaScript code, and evaluates / runs it.

This function is very rarely used and for a reason: it can be dangerous.

I recommend to read this article on the subject.

isFinite()

Returns true if the value passed as parameter is finite.

isFinite(1) //true
isFinite(Number.POSITIVE_INFINITY) //false
isFinite(Infinity) //false
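Like isNaN(), the global isFinite() converts its argument to a number before checking it. A brief sketch comparing it with Number.isFinite() (added in ES2015), which performs no conversion; the variable names are arbitrary:

```javascript
const globalWithString = isFinite('1') // '1' is converted to 1
const strictWithString = Number.isFinite('1') // a string is never finite
const globalWithNull = isFinite(null) // null is converted to 0
const strictWithNull = Number.isFinite(null)

console.log(globalWithString) // true
console.log(strictWithString) // false
console.log(globalWithNull) // true
console.log(strictWithNull) // false
console.log(isFinite(NaN)) // false
```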

isNaN()

Returns true if the value passed as parameter evaluates to NaN .

isNaN(NaN) //true
isNaN(Number.NaN) //true
isNaN('x') //true
isNaN(2) //false
isNaN(undefined) //true

This function is very useful because a NaN value is never equal to another NaN value. You must use the isNaN() global function to check if a value evaluates to NaN :

0/0 === NaN //false
isNaN(0/0) //true

parseFloat()

Like parseInt() , parseFloat() is used to convert a string value into a number, but retains the decimal part. Unlike parseInt() , it takes no radix argument and always parses decimal numbers:

parseFloat('10,000') //10     ❌
parseFloat('10.00') //10     ✅ (considered decimals, cut)
parseFloat('10.000') //10     ✅ (considered decimals, cut)
parseFloat('10.20') //10.2     ✅ (considered decimals)
parseFloat('10.81') //10.81     ✅ (considered decimals)
parseFloat('10000') //10000  ✅
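A few more inputs, as a sketch of how parseFloat() handles surrounding text (the variable names are arbitrary). It skips leading white space, reads as much of a valid number as it can, and ignores the rest:

```javascript
const withSuffix = parseFloat('3.14abc') // stops at the first invalid character
const leadingDot = parseFloat('.5')
const exponent = parseFloat('1e3') // exponent notation is understood
const padded = parseFloat('  42  ') // surrounding white space is ignored
const notANumber = parseFloat('abc')

console.log(withSuffix) // 3.14
console.log(leadingDot) // 0.5
console.log(exponent) // 1000
console.log(padded) // 42
console.log(notANumber) // NaN
```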

parseInt()

This function is used to convert a string value into an integer number:

const count = parseInt('1234', 10) //1234

Don’t forget the second parameter, which is the radix, always 10 for decimal numbers, or the conversion might try to guess the radix and give unexpected results.

parseInt() tries to get a number from a string that does not only contain a number:

parseInt('10 lions', 10) //10

but if the string does not start with a number, you’ll get NaN (Not a Number):

parseInt("I'm 10", 10) //NaN

Also, just like Number it’s not reliable with separators between the digits:

parseInt('10,000', 10) //10     ❌
parseInt('10.00', 10) //10     ✅ (considered decimals, cut)
parseInt('10.000', 10) //10     ✅ (considered decimals, cut)
parseInt('10.20', 10) //10     ✅ (considered decimals, cut)
parseInt('10.81', 10) //10     ✅ (considered decimals, cut)
parseInt('10000', 10) //10000  ✅
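The radix is also what lets parseInt() read other number systems. A small sketch (variable names are arbitrary) with a few bases, plus what happens when the radix is left out:

```javascript
const fromHex = parseInt('ff', 16)
const fromBinary = parseInt('101', 2)

// Without a radix, a 0x prefix makes parseInt() assume base 16
const guessedHex = parseInt('0x1f')

// parseInt() does not understand exponent notation: it stops at the e
const exponentString = parseInt('1e3', 10)

const emptyString = parseInt('', 10)

console.log(fromHex) // 255
console.log(fromBinary) // 5
console.log(guessedHex) // 31
console.log(exponentString) // 1
console.log(emptyString) // NaN
```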

Quiz

Welcome to the quiz! Try to answer those questions, which cover the topics of this module.

You can also write the question/answer into the Discord chat, to make sure it’s correct - other students or Flavio will check it for you!

  • write the code needed to create a shallow copy of a const car = { color: 'blue' } object
  • write the code needed to compare two objects
  • take the car object above and write the code needed to make its color property read only, not editable
  • Take the const car = { color: 'blue' } object and transform it into a string in the JSON notation. Then parse it back again to an object.
  • Use the Intl functions to format the number 10 so it’s representing a $10 bill, and then format it to represent a 10€ bill
  • Take the string “JavaScript is amazing” and generate a new string with the text “JavaScript”, cutting the unneeded part
  • Write 2 different ways to take the string “JavaScript is amazing” and generate a new string with the text “ama”, cutting the unneeded parts
  • Create a function that returns the maximum value between 4 multiplied 5 times and the square root of 130. All using the Math functions
  • Create a regular expression that extracts a number from a string
  • Create a regular expression that checks if a string contains the word “dog”
  • Create a regular expression that counts the number of times the word “dog” appears
  • write a function that accepts a string as input, checks if there is a number in the string, and extracts it as a number value
  • given the list of syntax errors, do you remember seeing some of them appear in your code? If not, which ones do you think you will encounter the most during your job?
  • what is Infinity in JavaScript?