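JavaScript Regular Expressions (Regex)

"Regular Expression" is commonly shortened to Regex or RegExp.

What is a Regular Expression?

A regular expression is a pattern that describes the syntax rules a string has to follow. It is a way of searching within a string or a passage, and it can be used to search, match, extract, replace, and transform text.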

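How do you write one?

You can create a RegExp object either with a pair of slashes / / or with new RegExp().

Creating a regular expression literal
Wrap the pattern between two slashes.

const rule = /[a-z]/;
// [] is a character set; it works like a search range.
// Matches one character in the range "a" to "z".

const regex = /some text/;
// A literal is compiled when the script is loaded, so it performs better when the pattern never changes.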

Creating a RegExp object

const rules = new RegExp('[0-9]');
// Matches one character in the range "0" to "9".

const regex = new RegExp('some text');
// Building the object with new RegExp() suits cases where the pattern has to be generated dynamically.
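As a minimal sketch of what "generated dynamically" can look like, the snippet below builds a pattern from a keyword that is only known at runtime. The helpers escapeRegExp and buildKeywordRegex are hypothetical names introduced for this illustration; they are not built into JavaScript.

```javascript
// Hypothetical helper: escape every character that has a special
// meaning inside a regular expression, so the keyword is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a case-insensitive, global pattern from a keyword.
function buildKeywordRegex(keyword) {
  return new RegExp(escapeRegExp(keyword), 'gi');
}

const priceRegex = buildKeywordRegex('$100');
console.log(priceRegex.source);                    // \$100
console.log('Only $100 today'.match(priceRegex));  // ["$100"]
```

A literal cannot do this, because its pattern is fixed when the script is written.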

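Adding flags

const caseInsensitive = /some text/i;
// i: ignore case
const globalSearch = new RegExp('some text', 'g');
// g: global search; match at every position in the string instead of stopping at the first match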
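To make the effect of the two flags concrete, here is a small sketch using a made-up sample sentence. It also shows one side effect of g that is easy to miss: a global regex remembers where its last match ended in its lastIndex property, so repeated test() calls on the same object can alternate between true and false.

```javascript
const sentence = 'Some text, some TEXT, SOME text';

// Without g, match() reports only the first match.
const firstMatch = sentence.match(/some text/i);
console.log(firstMatch[0]);      // Some text

// With g and i together, every occurrence is returned regardless of case.
const allMatches = sentence.match(/some text/gi);
console.log(allMatches.length);  // 3

// A regex with the g flag is stateful when used with test() or exec().
const stateful = /some/g;
const firstCall = stateful.test('some');   // true, lastIndex is now 4
const secondCall = stateful.test('some');  // false, search started at index 4
console.log(firstCall, secondCall, stateful.lastIndex); // true false 0
```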

How do you use one?

The RegExp object provides the test() and exec() methods. The String methods search(), match(), replace(), and split() also accept regular expressions.

Testing with test()

const str = 'happy';
// Check whether the string contains a letter from a to z
const rules = /[a-z]/;
console.log(rules.test(str));
// true

console.log(/\d/.test(str));
// false
// \d matches any digit character (0-9)

Getting match details with exec()

const regex = /hello world/i;
regex.exec('Hello World !!'); // ["Hello World", index: 0, input: "Hello World !!", groups: undefined]
regex.exec('Hello Regex !!'); // null; exec() returns null when the match fails
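Because exec() returns null on failure, it can drive a loop. The sketch below, using a made-up sample string, combines exec() with the g flag to collect every match together with its position; each call continues from where the previous match ended.

```javascript
const ingRegex = /ing/g;
const sampleText = 'printing and typesetting';

const found = [];
let result;
// With the g flag, each exec() call resumes after the previous match
// and finally returns null, which ends the loop.
while ((result = ingRegex.exec(sampleText)) !== null) {
  found.push({ text: result[0], index: result.index });
}

console.log(found);
// [ { text: 'ing', index: 5 }, { text: 'ing', index: 21 } ]
```

Without the g flag this loop would never end, because exec() would keep returning the first match.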

search() and match()

const paragraph = 'Lorem Ipsum is simply dummy text of the printing and typesetting industry.';

// search() looks for the pattern in the paragraph; it returns the index where the match starts, or -1 if nothing is found
paragraph.search('tExT'); // -1
paragraph.search(/tExT/i); // 28

// match() returns the details of the first match; with the g flag it returns every matched string instead
paragraph.match(/ing/); // ["ing", index: 45, ...]
paragraph.match(/ing/g); // ["ing", "ing"]
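The other two String methods mentioned earlier, replace() and split(), accept regular expressions in the same way. This is a minimal sketch with made-up input strings; the sets, \s, and + used in the patterns are explained in the sections that follow.

```javascript
// split(): break a string on commas, semicolons, or runs of whitespace.
const messy = 'apple,  banana;cherry   date';
const fruits = messy.split(/[,;\s]+/);
console.log(fruits); // [ 'apple', 'banana', 'cherry', 'date' ]

// replace(): collapse every run of whitespace into a single space.
const collapsed = 'too    many   spaces'.replace(/\s+/g, ' ');
console.log(collapsed); // too many spaces
```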

RegExp special characters

^ means the match must be at the start

const Word = 'apple';
Word.match(/^Ap/); //null
Word.match(/^Ap/i); //["ap", index: 0, input: "apple",..]
Word.match(/^pl/); //null

$ means the match must be at the end

const Word = 'apple';
Word.match(/Ap$/); //null
Word.match(/LE$/i); //["le", index: 3, input: "apple"..]
Word.match(/le$/); //["le", index: 3, input: "apple"..]
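Using ^ and $ together is a common way to require that the whole string matches, not just part of it. A small sketch with made-up test strings:

```javascript
const containsApple = /apple/;   // matches anywhere in the string
const exactlyApple = /^apple$/;  // the string must start and end with the pattern

console.log(containsApple.test('pineapple')); // true
console.log(exactlyApple.test('pineapple'));  // false
console.log(exactlyApple.test('apple'));      // true
```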

| means "or"

const regex = /color|colour/
regex.exec('color') // ["color", index: 0, ...]
regex.exec('colour') // ["colour", index: 0, ...]
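One thing to watch for when | is combined with ^ and $: the | splits the entire pattern, so each anchor applies to only one side. The sketch below uses parentheses to group the alternatives; grouping is not covered in this post, so treat it as an aside.

```javascript
// Reads as "starts with cat" OR "ends with dog".
const loose = /^cat|dog$/;
// The parentheses limit | to the two words, so the anchors apply to both.
const strict = /^(cat|dog)$/;

console.log(loose.test('catalog'));  // true
console.log(loose.test('hotdog'));   // true
console.log(strict.test('catalog')); // false
console.log(strict.test('dog'));     // true
```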

\ backslash, escapes a special character

const regex = /\$100/; // $ no longer means "end of string"
regex.test('$100'); // true
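When the same pattern is passed to new RegExp() as a string, the backslash has to be written twice, because the string literal consumes one level of escaping before the pattern reaches the regex engine. A minimal sketch:

```javascript
const fromLiteral = /\$100/;
const fromString = new RegExp('\\$100'); // '\\' in a string is a single backslash

console.log(fromLiteral.source === fromString.source); // true
console.log(fromString.test('$100'));                  // true

// With a single backslash the string is just '$100', so the pattern becomes
// /$100/: "end of string" followed by 100, which can never match here.
const forgotten = new RegExp('\$100');
console.log(forgotten.test('$100'));                   // false
```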

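[] character sets

A set means that this one character may be any one of the characters inside [ ].

// Matches as long as there is an uppercase English letter
const upperVerbose = /[ABCDEFGHIJKLMNOPQRSTUVWXYZ]/;
'K'.match(upperVerbose); // ["K", index: 0, ...]
'δ'.match(upperVerbose); // null

// Use '-' to shorten a set: 'A-Z' covers every English letter from A to Z
const upper = /[A-Z]/;

// To match an English letter or a digit, write it like this
const alphanumeric = /[A-Za-z0-9]/;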

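Some commonly used sets have shorthand special characters.

const anyChar = /./;     // matches any single character except a line terminator
const digit = /\d/;      // matches one digit; equivalent to /[0-9]/
const wordChar = /\w/;   // matches one English letter, digit, or underscore; equivalent to /[A-Za-z0-9_]/
const whitespace = /\s/; // matches one whitespace character (e.g. space, tab, newline)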
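To see how these shorthands overlap, the sketch below runs a few single characters through all four. classify is a hypothetical helper written only for this demonstration.

```javascript
// Hypothetical helper: report which shorthand classes match a single character.
function classify(ch) {
  return {
    dot: /./.test(ch),
    digit: /\d/.test(ch),
    word: /\w/.test(ch),
    space: /\s/.test(ch),
  };
}

console.log(classify('7'));  // { dot: true, digit: true, word: true, space: false }
console.log(classify('_'));  // { dot: true, digit: false, word: true, space: false }
console.log(classify(' '));  // { dot: true, digit: false, word: false, space: true }
console.log(classify('\n')); // { dot: false, digit: false, word: false, space: true }
```

Note that a newline is whitespace for \s but is not matched by the dot.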

[^ ] negated sets

Use a negated set [^ ] to match any character that is not in the set.

const regex = /[^\w]/
regex.test('a') // false
regex.test('!') // true
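A typical use of a negated set is stripping out everything you do not want. The sketch below assumes a made-up phone-style input and keeps only its digits; it also shows that \W is the built-in shorthand for [^\w].

```javascript
const phoneInput = '(02) 1234-5678';

// Replace every character that is NOT a digit with an empty string.
const digitsOnly = phoneInput.replace(/[^0-9]/g, '');
console.log(digitsOnly); // 0212345678

// \W (uppercase) is equivalent to [^\w].
console.log(/\W/.test('!')); // true
console.log(/\W/.test('a')); // false
```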

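{} quantifiers

A set on its own matches only one character. To match the same rule several times in a row, add a quantifier { } after it.

// Without a quantifier, matching 5 consecutive digits means writing \d 5 times
const fiveDigitsVerbose = /\d\d\d\d\d/;
fiveDigitsVerbose.test('12345'); // true

// {5} means exactly 5 consecutive occurrences
const fiveDigits = /\d{5}/;
fiveDigits.exec('abcde12345'); // ["12345", index: 5, ...]
fiveDigits.exec('a1b2c3d4e5'); // null

// {2,} means 2 or more consecutive occurrences
const twoOrMorePluses = /\w\+{2,}/;
twoOrMorePluses.exec('a+'); // null
twoOrMorePluses.exec('a++'); // ["a++", index: 0, ...]

// {2,5} means 2 to 5 consecutive occurrences (do not put a space inside the braces)
const twoToFiveWordChars = /^\w{2,5}!/;
twoToFiveWordChars.exec('Hi!'); // ["Hi!", index: 0, ...]
twoToFiveWordChars.exec('Helloooo!'); // null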
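Quantifiers are what make format checks compact. The sketch below checks a YYYY-MM-DD shape; the format is only an assumed example, and the pattern checks the shape of the string, not whether the date actually exists. The last lines show why a space inside the braces matters: without the u flag, JavaScript treats a malformed quantifier such as {2, 5} as literal text.

```javascript
// Four digits, a dash, two digits, a dash, two digits, and nothing else.
const dateShape = /^\d{4}-\d{2}-\d{2}$/;

console.log(dateShape.test('2024-01-31'));  // true
console.log(dateShape.test('24-1-31'));     // false
console.log(dateShape.test('2024-01-311')); // false
console.log(dateShape.test('9999-99-99'));  // true, the shape is right even though the date is not

// With a space, {2, 5} is no longer a quantifier; it matches those characters literally.
const withSpace = /^\w{2, 5}!/;
console.log(withSpace.test('Hi!'));      // false
console.log(withSpace.test('a{2, 5}!')); // true
```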

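Quantifiers also have shorthand special characters.

// ? means 0 or 1 occurrence; equivalent to {0,1}
const zeroOrOne = /\w?/;
// + means 1 or more occurrences; equivalent to {1,}
const oneOrMore = /\w+/;
// * means 0 or more occurrences; equivalent to {0,}
const zeroOrMore = /\w*/;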
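A few made-up inputs show the difference between the three shorthands in practice. Note how *, because it accepts zero occurrences, can succeed with an empty match.

```javascript
// ? makes the "u" optional, so both spellings are accepted.
const colorRegex = /^colou?r$/;
console.log(colorRegex.test('color'));   // true
console.log(colorRegex.test('colour'));  // true
console.log(colorRegex.test('colouur')); // false

// + needs at least one digit and takes the whole run.
const firstNumber = 'abc123def'.match(/\d+/)[0];
console.log(firstNumber); // 123

// * also succeeds with zero digits, so it matches the empty string at index 0.
const emptyMatch = 'abc'.match(/\d*/);
console.log(emptyMatch[0] === '', emptyMatch.index); // true 0
```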

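In practice, +, ?, *, and {2,5} are all greedy quantifiers: they prefer to match as many consecutive occurrences as possible. Adding a question mark after the quantifier, as in +?, ??, *?, and {2,5}?, turns it into a lazy quantifier, which prefers as few consecutive occurrences as possible.

// Greedy: prefer as many '+' as possible
const greedy = /a\+{2,}/;
greedy.exec('a+++++'); // ["a+++++", index: 0, ...]
// Lazy: prefer as few '+' as possible
const lazy = /a\+{2,}?/;
lazy.exec('a+++++'); // ["a++", index: 0, ...]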
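The difference matters most when a pattern has a closing delimiter. As a minimal sketch with a made-up HTML-like string, the greedy version runs to the last ">" in the text, while the lazy version stops at the first one.

```javascript
const markup = '<b>bold</b> and <i>italic</i>';

// Greedy: .+ grabs as much as it can, so the match ends at the LAST ">".
const greedyTag = markup.match(/<.+>/)[0];
console.log(greedyTag); // <b>bold</b> and <i>italic</i>

// Lazy: .+? grabs as little as it can, so the match ends at the FIRST ">".
const lazyTag = markup.match(/<.+?>/)[0];
console.log(lazyTag);   // <b>

// Combined with g, the lazy version finds each tag separately.
const allTags = markup.match(/<.+?>/g);
console.log(allTags);   // [ '<b>', '</b>', '<i>', '</i>' ]
```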


hoyi-23
