All APIs & References¶
Peppa PEG - Ultra lightweight PEG Parser in ANSI C.
MIT License
Copyright (c) 2021 Ju
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- Author
Ju Lin
- Copyright
MIT
- Date
2021
- See
DEFINES¶
Defines
-
P4_MALLOC¶
The malloc function. By default, it’s malloc.
-
P4_FREE¶
The free function. By default, it’s free.
-
P4_REALLOC¶
The realloc function. By default, it’s realloc.
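The three allocator macros suggest that the memory functions can be swapped at compile time. The following is a minimal sketch of that pattern, assuming the header only falls back to the standard functions when a macro is not already defined (this guard is an assumption, not confirmed by this page). The names `counting_malloc`, `counting_free` and `live_blocks` are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: number of blocks currently allocated. */
static size_t live_blocks = 0;

static void *counting_malloc(size_t size) {
    void *p = malloc(size);
    if (p != NULL) live_blocks++;
    return p;
}

static void counting_free(void *p) {
    if (p != NULL) live_blocks--;
    free(p);
}

/* The application defines the macros before the library header is included. */
#define P4_MALLOC counting_malloc
#define P4_FREE   counting_free

/* Assumed shape of the defaults: used only when nothing was defined above. */
#ifndef P4_MALLOC
#define P4_MALLOC malloc
#endif
#ifndef P4_FREE
#define P4_FREE free
#endif
```

With this arrangement every allocation made through `P4_MALLOC` is counted, which can help when hunting leaks in a grammar that is built and torn down repeatedly.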
-
P4_MAJOR_VERSION¶
Major version number.
-
P4_MINOR_VERSION¶
Minor version number.
-
P4_PATCH_VERSION¶
Patch version number.
-
P4_FLAG_NONE¶
No flag.
-
P4_FLAG_SQUASHED¶
When the flag is set, the grammar rule squashes all of its children nodes: node->head and node->tail will be NULL.
For example, rule b has P4_FLAG_SQUASHED. After parsing, the children nodes d and e are gone:
      a    ===>    a
  (b)   c  ===>  b   c
 d   e
The flag will impact all of the descendant rules.
-
P4_FLAG_LIFTED¶
When the flag is set, the grammar rule will replace the node with its children nodes.
For example, rule b has P4_FLAG_LIFTED. After parsing, the node (b) is gone, and its children d and e become the children of a:
      a    ===>      a
  (b)   c  ===>  d   e   c
 d   e
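The difference between squashing and lifting can be sketched on a toy tree. This is only an illustration of the two transformations described above. `DemoNode`, `demo_squash` and `demo_lift` are hypothetical stand-ins and do not reflect how the library stores or rewrites its nodes.

```c
#include <stddef.h>
#include <string.h>

#define MAX_KIDS 8

/* Hypothetical stand-in for a parse-tree node; not the library's P4_Node. */
typedef struct DemoNode {
    const char *name;
    struct DemoNode *kids[MAX_KIDS];
    size_t count;
} DemoNode;

/* SQUASHED: the node stays, its children are dropped. */
static void demo_squash(DemoNode *node) {
    node->count = 0;
}

/* LIFTED: the child at `index` is replaced by its own children.
 * Returns 0 on success, -1 if the index is invalid or the result would not fit. */
static int demo_lift(DemoNode *parent, size_t index) {
    if (index >= parent->count) return -1;
    DemoNode *child = parent->kids[index];
    size_t grand = child->count;
    size_t total = parent->count - 1 + grand;
    if (total > MAX_KIDS) return -1;
    /* Move the siblings that follow `index` to their final position. */
    memmove(&parent->kids[index + grand], &parent->kids[index + 1],
            (parent->count - index - 1) * sizeof(DemoNode *));
    /* Put the grandchildren where the lifted child used to be. */
    memcpy(&parent->kids[index], child->kids, grand * sizeof(DemoNode *));
    parent->count = total;
    return 0;
}
```

Lifting `b` out of `a -> (b -> (d, e), c)` leaves `a` with the children `d`, `e`, `c`, while squashing `b` keeps `b` under `a` but removes `d` and `e`.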
-
P4_FLAG_TIGHT¶
When the flag is set, no P4_FLAG_SPACED rules are inserted inside the sequences and repetitions of the grammar rule.
This flag only works for the repetition and sequence expressions.
-
P4_FLAG_SCOPED¶
When the flag is set, the effects of SQUASHED and TIGHT are canceled.
Regardless of whether an ancestor expression has the SQUASHED or TIGHT flag, starting from this expression, Peppa PEG will start creating nodes and applying SPACED rules for sequences and repetitions.
-
P4_FLAG_SPACED¶
When the flag is set, the expression will be inserted between every node inside the sequences and repetitions.
If there are multiple SPACED expressions, Peppa PEG will iterate through all SPACED expressions.
This flag keeps the grammar clean and tidy without inserting a whitespace rule everywhere.
-
P4_FLAG_NON_TERMINAL¶
When the flag is set and a sequence/repetition node tree has only one child, the child node will replace the parent.
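Since the flags are combined with bitwise OR, their interaction can be sketched with a small bitmask model. The bit values below are placeholders chosen for this sketch, not the library's real values, and `demo_inherit` is a simplified, assumed reading of how SCOPED cancels SQUASHED and TIGHT coming from ancestors.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t DemoFlag;  /* mirrors "typedef uint32_t P4_ExpressionFlag" */

/* Placeholder bit values for this sketch only. */
#define DEMO_FLAG_NONE     ((DemoFlag)0x0)
#define DEMO_FLAG_SQUASHED ((DemoFlag)0x1)
#define DEMO_FLAG_LIFTED   ((DemoFlag)0x2)
#define DEMO_FLAG_TIGHT    ((DemoFlag)0x4)
#define DEMO_FLAG_SCOPED   ((DemoFlag)0x8)

static bool demo_has_flag(DemoFlag flags, DemoFlag flag) {
    return (flags & flag) != 0;
}

/* Compute the SQUASHED/TIGHT state that applies below an expression.
 * `inherited` is what the ancestors passed down, `own` is the expression's
 * own flag set. SCOPED discards whatever was inherited. */
static DemoFlag demo_inherit(DemoFlag inherited, DemoFlag own) {
    if (demo_has_flag(own, DEMO_FLAG_SCOPED))
        inherited = DEMO_FLAG_NONE;
    return inherited | (own & (DEMO_FLAG_SQUASHED | DEMO_FLAG_TIGHT));
}
```

In this model a rule without flags under a squashed, tight ancestor stays squashed and tight, while a SCOPED rule starts again from a clean state.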
-
P4_DEFAULT_RECURSION_LIMIT¶
The default recursion limit.
It can be adjusted via function P4_SetRecursionLimit.
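To illustrate why a recursion limit matters, here is a minimal sketch of a recursive matcher that refuses to descend past a configured depth. It is not the library's implementation. `DemoParser`, `DemoError` and `demo_nested` are hypothetical, and the toy grammar is `nested := '(' nested ')' | empty`.

```c
#include <stddef.h>

typedef enum { DEMO_OK = 0, DEMO_MATCH_ERROR, DEMO_STACK_ERROR } DemoError;

typedef struct {
    const char *text;
    size_t pos;
    size_t depth;   /* current recursion depth */
    size_t limit;   /* maximum allowed depth */
} DemoParser;

static DemoError demo_nested(DemoParser *p) {
    if (p->text[p->pos] != '(') return DEMO_OK;          /* empty alternative */
    if (p->depth >= p->limit) return DEMO_STACK_ERROR;   /* limit reached */
    p->depth++;
    p->pos++;                                            /* consume '(' */
    DemoError err = demo_nested(p);
    p->depth--;
    if (err != DEMO_OK) return err;
    if (p->text[p->pos] != ')') return DEMO_MATCH_ERROR;
    p->pos++;                                            /* consume ')' */
    return DEMO_OK;
}
```

With a limit of 8 the input `((()))` is accepted; with a limit of 2 the same input is rejected with the stack error before the third level is entered.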
-
P4_MAX_RULE_NAME_LEN¶
The maximum length of a rule name.
-
E_WRONG_LIT¶
-
E_BACKREF_OUT_REACHED¶
-
E_BACKREF_TO_SELF¶
-
E_TEXT_TOO_SHORT¶
-
E_OUT_RANGED¶
-
E_INVALID_UNICODE_CHAR¶
-
E_INVALID_UNICODE_PROPERTY¶
-
E_INVALID_UNICODE_CATEGORY¶
-
E_NO_SUCH_RULE¶
-
E_NO_ALTERNATIVE¶
-
E_NO_PROGRESSING¶
-
E_INSUFFICIENT_REPEAT¶
-
E_INSUFFICIENT_REPEAT2¶
-
E_EXCESSIVE_REPEAT¶
-
E_LEFT_RECUR_NO_LIFT¶
-
E_VIOLATE_NEGATIVE¶
-
E_MAX_RECURSION¶
-
E_INVALID_MATCH_CALLBACK¶
-
E_INVALID_ERROR_CALLBACK¶
-
E_NO_EXPR¶
-
E_WRONG_BACKREF¶
-
E_RECURSIVE_BACKREF¶
ENUMS¶
Enums
-
enum P4_ExpressionKind¶
The expression kind.
Values:
-
enumerator P4_Literal¶
Rule: Case-Sensitive Literal, Case-Insensitive Literal.
-
enumerator P4_Range¶
Rule: Range.
-
enumerator P4_UnicodeCategory¶
Rule: Unicode Category.
-
enumerator P4_Reference¶
Rule: Reference.
-
enumerator P4_Positive¶
Rule: Positive.
-
enumerator P4_Negative¶
Rule: Negative.
-
enumerator P4_Sequence¶
Rule: Sequence.
-
enumerator P4_BackReference¶
Rule: Case-Sensitive BackReference, Case-Insensitive BackReference.
-
enumerator P4_Choice¶
Rule: Choice.
-
enumerator P4_Repeat¶
Rule: RepeatMinMax, RepeatMin, RepeatMax, RepeatExact, OnceOrMore, ZeroOrMore, ZeroOrOnce.
-
enumerator P4_Cut¶
Rule: Cut.
-
enumerator P4_LeftRecursion¶
Rule: Left Recursion.
-
enum P4_Error¶
The error code.
Values:
-
enumerator P4_Ok¶
No error is like a blessing.
-
enumerator P4_InternalError¶
When there is an internal error. Please raise an issue: https://github.com/soasme/peppapeg/issues.
-
enumerator P4_MatchError¶
When no text is matched.
-
enumerator P4_NameError¶
When no name is resolved.
-
enumerator P4_AdvanceError¶
When the parse gets stuck forever or has reached the end.
-
enumerator P4_MemoryError¶
When out of memory.
-
enumerator P4_ValueError¶
When the given value is of unexpected type.
-
enumerator P4_IndexError¶
When the index is out of boundary.
-
enumerator P4_KeyError¶
When the id is out of the table.
-
enumerator P4_NullError¶
When null is encountered.
-
enumerator P4_StackError¶
When recursion limit is reached.
-
enumerator P4_PegError¶
When the given value is not valid peg grammar.
-
enumerator P4_CutError¶
When the failure occurs after a @cut operator.
TYPEDEFS¶
Typedefs
-
typedef uint32_t ucs4_t¶
A single unicode character. Used when libunistring is not installed.
-
typedef uint32_t P4_ExpressionFlag¶
The flag of expression.
-
typedef char *P4_String¶
The C string type in locale encoding, by default utf-8.
-
typedef uint8_t *P4_Utf8¶
The utf-8 string type.
-
typedef void *P4_UserData¶
The reference of user data.
-
typedef void (*P4_UserDataFreeFunc)(P4_UserData)¶
The function to free user data.
-
typedef struct P4_Grammar P4_Grammar¶
The grammar object that holds all grammar rules.
-
typedef struct P4_Expression P4_Expression¶
The grammar rule expression.
-
typedef P4_Error (*P4_MatchCallback)(P4_Grammar*, P4_Expression*, P4_Node*)¶
The callback for a successful match.
-
typedef P4_Error (*P4_ErrorCallback)(P4_Grammar*, P4_Expression*)¶
The callback for a failed match.
-
typedef struct P4_Position P4_Position¶
The position.
P4_Position does not hold a pointer to the string.
Example:
P4_Position pos = { .pos=10, .lineno=1, .offset=2 }; printf("%u..%u\n", pos.lineno, pos.offset);
-
typedef struct P4_RuneRange P4_RuneRange¶
P4_RuneRange specifies a range between two runes.
Example:
P4_RuneRange range = { .lower='a', .upper='z', .stride=1 };
FUNCTIONS¶
Functions
-
P4_String P4_Version(void)¶
Provide the version string for the library.
Example:
P4_String version = P4_Version(); printf("version=%s\n", version);
- Returns
a string like “1.0.0”.
-
size_t P4_ReadEscapedRune(char *text, ucs4_t *rune)¶
Read an escaped single code point (rune) from an UTF-8 string.
For example:
ucs4_t rune;
P4_ReadEscapedRune("\\u000D", &rune); // rune is 0xd
- Parameters
text – The text.
rune – The rune.
- Returns
The size of rune.
-
void *P4_ConcatRune(void *str, ucs4_t chr, size_t n)¶
Append a rune to str (in-place).
- Parameters
str – The string to be appended.
chr – The rune.
n – The size of rune. Should be 1, 2, 3, 4.
- Returns
The string starting from appended rune.
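The size parameter n (1 to 4) corresponds to the number of bytes a code point occupies in UTF-8. As a sketch of that relationship, the helpers below compute the size and write the bytes. `demo_utf8_size` and `demo_utf8_encode` are hypothetical names and are not part of the library. Surrogate code points are not rejected here.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of UTF-8 bytes needed for a code point, or 0 if it is out of range. */
static size_t demo_utf8_size(uint32_t cp) {
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    if (cp <= 0x10FFFF) return 4;
    return 0;
}

/* Write the UTF-8 bytes of cp into out (room for 4 bytes); return the size. */
static size_t demo_utf8_encode(uint32_t cp, unsigned char *out) {
    size_t n = demo_utf8_size(cp);
    switch (n) {
    case 1:
        out[0] = (unsigned char)cp;
        break;
    case 2:
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        break;
    case 3:
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        break;
    case 4:
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        break;
    default:
        break;  /* out of range: nothing written */
    }
    return n;
}
```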
-
P4_Expression *P4_CreateLiteral(const P4_String literal, bool sensitive)¶
Create a P4_Literal expression.
Example:
// It can match "let", "Let", "LET", etc.
P4_Expression* insensitive = P4_CreateLiteral("let", false);
// It can only match "let".
P4_Expression* sensitive = P4_CreateLiteral("let", true);
The object holds a full copy of the literal.
- Parameters
literal – The exact string to match.
sensitive – Whether the string is case-sensitive.
- Returns
A P4_Expression.
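What the sensitive parameter changes can be sketched with a small standalone matcher. This is an ASCII-only illustration. `demo_match_literal` is a hypothetical helper, and the library itself may handle Unicode case folding differently.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Match `literal` at the start of `text`.
 * Returns the number of bytes matched, or 0 when the literal does not match. */
static size_t demo_match_literal(const char *text, const char *literal, bool sensitive) {
    size_t i = 0;
    for (; literal[i] != '\0'; i++) {
        unsigned char t = (unsigned char)text[i];
        unsigned char l = (unsigned char)literal[i];
        if (t == '\0') return 0;  /* text is shorter than the literal */
        if (sensitive ? (t != l) : (tolower(t) != tolower(l))) return 0;
    }
    return i;
}
```

With `sensitive` set to false the input `LET x` matches the literal `let`; with `sensitive` set to true it does not.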
-
P4_Error P4_AddLiteral(P4_Grammar *grammar, P4_String name, const P4_String literal, bool sensitive)¶
Add a literal expression as grammar rule.
Example:
P4_AddLiteral(grammar, "R1", "let", true);
- Parameters
grammar – The grammar.
name – The grammar rule name.
literal – The exact string to match.
sensitive – Whether the string is case-sensitive.
- Returns
The error code.
-
P4_Expression *P4_CreateRange(ucs4_t lower, ucs4_t upper, size_t stride)¶
Create a P4_Range expression.
Example:
// It can match 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
P4_Expression* digit = P4_CreateRange('0', '9', 1);
// It can match any code point from U+4E00 to U+9FFF (CJK Unified Ideographs block).
P4_Expression* cjk = P4_CreateRange(0x4E00, 0x9FFF, 1);
- Parameters
lower – The lower bound of UTF-8 rule to match (inclusive).
upper – The upper bound of UTF-8 rule to match (inclusive).
stride – The stride when iterating the characters in range.
- Returns
A P4_Expression.
-
P4_Expression *P4_CreateRanges(size_t count, P4_RuneRange *ranges)¶
Create a P4_Range expression that holds multiple ranges.
Example:
P4_RuneRange alphadigits[] = {{'a', 'z', 1}, {'0', '9', 1}};
P4_Expression* range = P4_CreateRanges(
    sizeof(alphadigits) / sizeof(P4_RuneRange),
    alphadigits
);
- Parameters
count – The total number of ranges.
ranges – A list of P4_RuneRange.
- Returns
A P4_Expression.
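How lower, upper and stride might combine can be sketched as a membership test. This assumes that a stride of k means every k-th code point counted from lower, which is an interpretation rather than something this page states. `DemoRuneRange` copies the field names from the P4_RuneRange example, and the functions are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in with the same field names as the P4_RuneRange example. */
typedef struct {
    uint32_t lower;
    uint32_t upper;
    size_t stride;
} DemoRuneRange;

/* True when cp lies in [lower, upper] and is reachable from lower by stride. */
static bool demo_in_range(const DemoRuneRange *r, uint32_t cp) {
    if (r->stride == 0 || cp < r->lower || cp > r->upper) return false;
    return (cp - r->lower) % r->stride == 0;
}

/* True when cp matches any of the given ranges. */
static bool demo_in_ranges(const DemoRuneRange *ranges, size_t count, uint32_t cp) {
    for (size_t i = 0; i < count; i++) {
        if (demo_in_range(&ranges[i], cp)) return true;
    }
    return false;
}
```

Under this reading, `{'0', '9', 2}` accepts only the even digits, and a list such as `{{'a', 'z', 1}, {'0', '9', 1}}` accepts lowercase letters and digits.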
-
P4_Error P4_AddRange(P4_Grammar *grammar, P4_String name, ucs4_t lower, ucs4_t upper, size_t stride)¶
Add a range expression as grammar rule.
Example:
P4_AddRange(grammar, "R1", '0', '9', 1);
P4_AddRange(grammar, "R2", 0x4E00, 0x9FFF, 1);
- Parameters
grammar – The grammar.
name – The grammar rule name.
lower – The lower bound of UTF-8 rule to match (inclusive).
upper – The upper bound of UTF-8 rule to match (inclusive).
stride – The stride when iterating the characters in range.
- Returns
The error code.
-
P4_Error P4_AddRanges(P4_Grammar *grammar, P4_String name, size_t count, P4_RuneRange *ranges)¶
Add sub-ranges expression as grammar rule.
Example:
P4_RuneRange alphadigits[] = {{'a', 'z', 1}, {'0', '9', 1}};
P4_Error err = P4_AddRanges(
    grammar, "R1",
    sizeof(alphadigits) / sizeof(P4_RuneRange),
    alphadigits
);
- Parameters
grammar – The grammar.
name – The grammar rule name.
count – The total number of ranges.
ranges – A list of P4_RuneRange.
- Returns
The error code.
-
P4_Expression *P4_CreateReference(P4_String reference)¶
Create a P4_Reference expression.
Example:
P4_Expression* expr = P4_CreateReference("entry");
- Parameters
reference – The grammar rule name.
- Returns
A P4_Expression.
-
P4_Error P4_AddReference(P4_Grammar *grammar, P4_String name, P4_String ref_name)¶
Add a reference expression as grammar rule.
Example:
P4_AddReference(grammar, "R1", "R2");
- Parameters
grammar – The grammar.
name – The grammar rule name.
ref_name – The name of referenced grammar rule.
- Returns
The error code.
-
P4_Expression *P4_CreatePositive(P4_Expression *refexpr)¶
Create a P4_Positive expression.
Example:
// If the following text starts with "let", the match is successful.
P4_Expression* expr = P4_CreatePositive(P4_CreateLiteral("let", true));
- Parameters
refexpr – The positive pattern to check.
- Returns
A P4_Expression.
-
P4_Error P4_AddPositive(P4_Grammar *grammar, P4_String name, P4_Expression *refexpr)¶
Add a positive expression as grammar rule.
Example:
P4_AddPositive(grammar, "R1", P4_CreateLiteral("let", true));
- Parameters
grammar – The grammar.
name – The grammar rule name.
refexpr – The positive pattern to check.
- Returns
The error code.
-
P4_Expression *P4_CreateNegative(P4_Expression *expr)¶
Create a P4_Negative expression.
Example:
// If the following text does not start with "let", the match is successful.
P4_Expression* expr = P4_CreateNegative(P4_CreateLiteral("let", true));
- Parameters
expr – The negative pattern to check.
- Returns
A P4_Expression.
-
P4_Error P4_AddNegative(P4_Grammar *grammar, P4_String name, P4_Expression *refexpr)¶
Add a negative expression as grammar rule.
Example:
P4_AddNegative(grammar, "R1", P4_CreateLiteral("let", true));
- Parameters
grammar – The grammar.
name – The grammar rule name.
refexpr – The negative pattern to check.
- Returns
The error code.
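Positive and negative expressions are lookaheads: they test what follows without consuming it. A minimal standalone sketch of that behaviour follows. `DemoCursor` and the `demo_*` functions are hypothetical and only model literal patterns.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *text;
    size_t pos;   /* must not exceed strlen(text) */
} DemoCursor;

static bool demo_starts_with(const DemoCursor *c, const char *lit) {
    return strncmp(c->text + c->pos, lit, strlen(lit)) == 0;
}

/* Positive lookahead: succeeds when `lit` follows; the cursor does not move. */
static bool demo_positive(const DemoCursor *c, const char *lit) {
    return demo_starts_with(c, lit);
}

/* Negative lookahead: succeeds when `lit` does not follow; the cursor does not move. */
static bool demo_negative(const DemoCursor *c, const char *lit) {
    return !demo_starts_with(c, lit);
}

/* For contrast: an ordinary literal match advances the cursor. */
static bool demo_consume(DemoCursor *c, const char *lit) {
    if (!demo_starts_with(c, lit)) return false;
    c->pos += strlen(lit);
    return true;
}
```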
-
P4_Expression *P4_CreateCut()¶
Create a P4_Cut expression.
- Returns
A P4_Expression.
-
P4_Expression *P4_CreateLeftRecursion(P4_Expression *lhs, P4_Expression *rhs)¶
Create a P4_LeftRecursion expression.
- Parameters
lhs – Left-hand side of left recursion.
rhs – Right-hand side of left recursion.
- Returns
A P4_Expression.
-
P4_Error P4_AddLeftRecursion(P4_Grammar *grammar, P4_String name, P4_Expression *lhs, P4_Expression *rhs)¶
Add a P4_LeftRecursion expression as grammar rule.
- Parameters
grammar – The grammar.
name – The grammar rule name.
lhs – Left-hand side of left recursion.
rhs – Right-hand side of left recursion.
- Returns
The error code.
-
P4_Expression *P4_CreateSequence(size_t count)¶
Create a P4_Sequence expression.
Example:
P4_Expression* expr = P4_CreateSequence(3);
Note that such an expression is useless as its members are all empty. Please set members using P4_SetMembers.
This function can be useful if you need to add members dynamically.
- Parameters
count – The number of sequence members.
- Returns
A P4_Expression.
-
P4_Expression *P4_CreateSequenceWithMembers(size_t count, ...)¶
Create a P4_Sequence expression.
Example:
// It can match { BODY }.
P4_Expression* expr = P4_CreateSequenceWithMembers(3,
    P4_CreateLiteral("{", true),
    P4_CreateReference("BODY"),
    P4_CreateLiteral("}", true)
);
- Parameters
count – The number of sequence members.
... – The vararg of every sequence member.
- Returns
A P4_Expression.
-
P4_Error P4_AddSequence(P4_Grammar *grammar, P4_String name, size_t count)¶
Add a sequence expression as grammar rule.
Example:
P4_AddSequence(grammar, "R1", 3);
Note that such an expression is useless as its members are all empty. Please set members using P4_SetMembers.
This function can be useful if you need to add members dynamically.
- Parameters
grammar – The grammar.
name – The grammar rule name.
count – The number of sequence members.
- Returns
The error code.
-
P4_Error P4_AddSequenceWithMembers(P4_Grammar *grammar, P4_String name, size_t count, ...)¶
Add a sequence expression as grammar rule.
Example:
P4_AddSequenceWithMembers(grammar, "R1", 3,
    P4_CreateLiteral("{", true),
    P4_CreateReference("BODY"),
    P4_CreateLiteral("}", true)
);
- Parameters
grammar – The grammar.
name – The grammar rule name.
count – The number of sequence members.
... – The members.
- Returns
The error code.
-
P4_Expression *P4_CreateChoice(size_t count)¶
Create a P4_Choice expression.
Example:
P4_Expression* expr = P4_CreateChoice(3);
Note that such an expression is useless as its members are all empty. Please set members using P4_SetMembers.
This function can be useful if you need to add members dynamically.
- Parameters
count – The number of choice members.
- Returns
A P4_Expression.
-
P4_Expression *P4_CreateChoiceWithMembers(size_t count, ...)¶
Create a P4_Choice expression.
Example:
// It can match a space, a tab, or a newline.
P4_Expression* expr = P4_CreateChoiceWithMembers(3,
    P4_CreateLiteral(" ", true),
    P4_CreateLiteral("\t", true),
    P4_CreateLiteral("\n", true)
);
- Parameters
count – The number of choice members.
... – The vararg of every choice member.
- Returns
A P4_Expression.
-
P4_Error P4_AddChoice(P4_Grammar *grammar, P4_String name, size_t count)¶
Add a choice expression as grammar rule.
Example:
P4_AddChoice(grammar, "R1", 3);
Note that such an expression is useless as its members are all empty. Please set members using P4_SetMembers.
This function can be useful if you need to add members dynamically.
- Parameters
grammar – The grammar.
name – The grammar rule name.
count – The number of choice members.
- Returns
The error code.
-
P4_Error P4_AddChoiceWithMembers(P4_Grammar *grammar, P4_String name, size_t count, ...)¶
Add a choice expression as grammar rule.
Example:
P4_AddChoiceWithMembers(grammar, "R1", 3,
    P4_CreateLiteral(" ", true),
    P4_CreateLiteral("\t", true),
    P4_CreateLiteral("\n", true)
);
- Parameters
grammar – The grammar.
name – The grammar rule name.
count – The number of choice members.
... – The members.
- Returns
The error code.
-
P4_Expression *P4_CreateRepeatMinMax(P4_Expression *repeat_expr, size_t min, size_t max)¶
Create a P4_Repeat expression minimal min times and maximal max times.
Example:
// It can match string "a", "aa", or "aaa".
P4_Expression* expr = P4_CreateRepeatMinMax(P4_CreateLiteral("a", true), 1, 3);
- Parameters
repeat_expr – The repeated expression.
min – The minimum repeat times.
max – The maximum repeat times.
- Returns
A P4_Expression.
-
P4_Error P4_AddRepeatMinMax(P4_Grammar *grammar, P4_String name, P4_Expression *repeat_expr, size_t min, size_t max)¶
Add a P4_Repeat expression, repeating at least min times and at most max times, as grammar rule.
Example:
// It can match string "a", "aa", or "aaa".
P4_AddRepeatMinMax(grammar, "R1", P4_CreateLiteral("a", true), 1, 3);
- Parameters
grammar – The grammar.
name – The grammar rule name.
repeat_expr – The repeated expression.
min – The minimum repeat times.
max – The maximum repeat times.
- Returns
The error code.
-
P4_Expression *P4_CreateRepeatMin(P4_Expression *repeat_expr, size_t min)¶
Create a P4_Repeat expression minimal min times and maximal SIZE_MAX times.
Example:
// It can match string "a", "aa", "aaa", ....
P4_Expression* expr = P4_CreateRepeatMin(P4_CreateLiteral("a", true), 1);
It’s equivalent to P4_CreateRepeatMinMax(expr, min, SIZE_MAX);
- Parameters
repeat_expr – The repeated expression.
min – The minimum repeat times.
- Returns
A P4_Expression.
-
P4_Error P4_AddRepeatMin(P4_Grammar *grammar, P4_String name, P4_Expression *repeat_expr, size_t min)¶
Add a RepeatMin expression as grammar rule.
Example:
// It can match string "a", "aa", "aaa", ....
P4_AddRepeatMin(grammar, "R1", P4_CreateLiteral("a", true), 1);
- Parameters
grammar – The grammar.
name – The grammar rule name.
repeat_expr – The repeated expression.
min – The minimum repeat times.
- Returns
The error code.
-
P4_Expression *P4_CreateRepeatMax(P4_Expression *repeat_expr, size_t max)¶
Create a P4_Repeat expression maximal max times.
Example:
// It can match string "", "a", "aa", "aaa".
P4_Expression* expr = P4_CreateRepeatMax(P4_CreateLiteral("a", true), 3);
It’s equivalent to P4_CreateRepeatMinMax(expr, 0, max);
- Parameters
repeat_expr – The repeated expression.
max – The maximum repeat times.
- Returns
A P4_Expression.
-
P4_Error P4_AddRepeatMax(P4_Grammar *grammar, P4_String name, P4_Expression *repeat_expr, size_t max)¶
Add a RepeatMax expression as grammar rule.
Example:
// It can match string "", "a", "aa", "aaa".
P4_AddRepeatMax(grammar, "name", P4_CreateLiteral("a", true), 3);
- Parameters
grammar – The grammar.
name – The grammar rule name.
repeat_expr – The repeated expression.
max – The maximum repeat times.
- Returns
The error code.
-
P4_Expression *P4_CreateRepeatExact(P4_Expression *repeat_expr, size_t times)¶
Create a P4_Repeat expression that repeats exactly N times.
Example:
// It can match string "aaa".
P4_Expression* expr = P4_CreateRepeatExact(P4_CreateLiteral("a", true), 3);
It’s equivalent to P4_CreateRepeatMinMax(expr, times, times);
- Parameters
repeat_expr – The repeated expression.
times – The repeat times.
- Returns
A P4_Expression.
-
P4_Error P4_AddRepeatExact(P4_Grammar *grammar, P4_String name, P4_Expression *repeat_expr, size_t times)¶
Add a RepeatExact expression as grammar rule.
Example:
// It can match string "aaa".
P4_AddRepeatExact(grammar, "R1", P4_CreateLiteral("a", true), 3);
- Parameters
grammar – The grammar.
name – The grammar rule name.
repeat_expr – The repeated expression.
times – The repeat times.
- Returns
The error code.
-
P4_Expression *P4_CreateZeroOrOnce(P4_Expression *expr)¶
Create a P4_Repeat expression zero or once.
Example:
// It can match string "" or "a".
P4_Expression* expr = P4_CreateZeroOrOnce(P4_CreateLiteral("a", true));
It’s equivalent to P4_CreateRepeatMinMax(expr, 0, 1);
- Parameters
expr – The repeated expression.
- Returns
A P4_Expression.
-
P4_Error P4_AddZeroOrOnce(P4_Grammar *grammar, P4_String name, P4_Expression *repeat_expr)¶
Add a ZeroOrOnce expression as grammar rule.
Example:
// It can match string "" or "a".
P4_AddZeroOrOnce(grammar, "R1", P4_CreateLiteral("a", true));
It’s equivalent to P4_AddRepeatMinMax(grammar, name, repeat_expr, 0, 1);
- Parameters
grammar – The grammar.
name – The grammar rule name.
repeat_expr – The repeated expression.
- Returns
The error code.
-
P4_Expression *P4_CreateZeroOrMore(P4_Expression *expr)¶
Create a P4_Repeat expression zero or more times.
Example:
// It can match string "", "a", "aa", "aaa", ....
P4_Expression* expr = P4_CreateZeroOrMore(P4_CreateLiteral("a", true));
It’s equivalent to P4_CreateRepeatMinMax(expr, 0, SIZE_MAX);
- Parameters
expr – The repeated expression.
- Returns
A P4_Expression.
-
P4_Error P4_AddZeroOrMore(P4_Grammar *grammar, P4_String name, P4_Expression *repeat_expr)¶
Add a ZeroOrMore expression as grammar rule.
Example:
// It can match string "", "a", "aa", "aaa", ....
P4_AddZeroOrMore(grammar, "R1", P4_CreateLiteral("a", true));
- Parameters
grammar – The grammar.
name – The grammar rule name.
repeat_expr – The repeated expression.
- Returns
The error code.
-
P4_Expression *P4_CreateOnceOrMore(P4_Expression *expr)¶
Create a P4_Repeat expression once or more times.
Example:
// It can match string "a", "aa", "aaa", ....
P4_Expression* expr = P4_CreateOnceOrMore(P4_CreateLiteral("a", true));
It’s equivalent to P4_CreateRepeatMinMax(expr, 1, SIZE_MAX);
- Parameters
expr – The repeated expression.
- Returns
A P4_Expression.
-
P4_Error P4_AddOnceOrMore(P4_Grammar *grammar, P4_String name, P4_Expression *repeat_expr)¶
Add an OnceOrMore expression as grammar rule.
Example:
// It can match string "a", "aa", "aaa", ....
P4_AddOnceOrMore(grammar, "R1", P4_CreateLiteral("a", true));
- Parameters
grammar – The grammar.
name – The grammar rule name.
repeat_expr – The repeated expression.
- Returns
The error code.
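All of the repeat variants above reduce to a single (min, max) pair. The sketch below illustrates this with a greedy standalone matcher for a repeated literal. `demo_repeat` is hypothetical and does not model backtracking or the library's node creation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Greedily match `unit` repeated at the start of `text`.
 * Returns the number of bytes consumed when the repetition count lies in
 * [min, max], or -1 otherwise. */
static long demo_repeat(const char *text, const char *unit, size_t min, size_t max) {
    size_t len = strlen(unit);
    size_t count = 0;
    size_t pos = 0;
    if (len == 0) return -1;  /* an empty unit would never make progress */
    while (count < max && strncmp(text + pos, unit, len) == 0) {
        pos += len;
        count++;
    }
    return count >= min ? (long)pos : -1;
}

/* The named variants expressed through the same helper:
 *   RepeatMin(min)     -> demo_repeat(text, unit, min, SIZE_MAX)
 *   RepeatMax(max)     -> demo_repeat(text, unit, 0, max)
 *   RepeatExact(times) -> demo_repeat(text, unit, times, times)
 *   ZeroOrOnce         -> demo_repeat(text, unit, 0, 1)
 *   ZeroOrMore         -> demo_repeat(text, unit, 0, SIZE_MAX)
 *   OnceOrMore         -> demo_repeat(text, unit, 1, SIZE_MAX)
 */
```

For example, with the unit `a` and bounds (1, 3), the input `aaaa` consumes three bytes and leaves the fourth `a` unmatched.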
-
P4_Expression *P4_CreateBackReference(size_t backref_index, bool sensitive)¶
Create a P4_BackReference expression.
Example:
// It can match string "EOF MULTILINE EOF" or "eof MULTILINE eof", but not "EOF MULTILINE eof".
P4_Expression* expr = P4_CreateSequenceWithMembers(3,
    P4_CreateLiteral("EOF", false),
    P4_CreateReference("MULTILINE"),
    P4_CreateBackReference(0, true)
);
- Parameters
backref_index – The index of backref member in the sequence.
sensitive – Whether the backref matching is case sensitive.
- Returns
A P4_Expression.
-
P4_Error P4_SetMember(P4_Expression *expr, size_t index, P4_Expression *member)¶
Set the member of Sequence/Choice at a given index.
Example:
P4_Error err;
P4_Expression* expr = P4_CreateSequence(2);
if ((err = P4_SetMember(expr, 0, P4_CreateLiteral("a", true))) != P4_Ok) {
    // handle error.
}
if ((err = P4_SetMember(expr, 1, P4_CreateLiteral("b", true))) != P4_Ok) {
    // handle error.
}
- Parameters
expr – The sequence/choice expression.
index – The index of member.
member – The member expression.
- Returns
The error code.
-
P4_Error P4_SetReferenceMember(P4_Expression *expr, size_t index, P4_String name)¶
Set the referenced member of Sequence/Choice at a given index.
It’s equivalent to:
P4_SetMember(expr, index, P4_CreateReference(name));
- Parameters
expr – The sequence/choice expression.
index – The index of member.
name – The reference name of grammar rule for member.
- Returns
The error code.
-
size_t P4_GetMembersCount(P4_Expression *expr)¶
Get the total number of members for sequence/choice.
- Parameters
expr – The sequence/choice expression.
- Returns
The number. If something goes wrong, it returns 0.
P4_Expression* expr = P4_CreateSequence(3);
size_t count = P4_GetMembersCount(expr); // 3
-
P4_Expression *P4_GetMember(P4_Expression *expr, size_t index)¶
Get the member of sequence/choice at a given index.
Example:
P4_Expression* member = P4_GetMember(expr, 0);
- Parameters
expr – The sequence/choice expression.
index – The index of a member.
- Returns
The P4_Expression object of a member.
-
P4_Expression *P4_CreateStartOfInput()¶
Create a Start-Of-Input expression.
Example:
P4_Expression* expr = P4_CreateStartOfInput();
Start-Of-Input can be used to insert whitespace before the actual content.
Example:
P4_AddSequenceWithMembers(grammar, "R1", 2, P4_CreateStartOfInput(), P4_CreateLiteral("HELLO", true) );
Start-Of-Input is equivalent to P4_CreatePositive(P4_CreateRange(1, 0x10ffff, 1)).
- Returns
A P4_Expression.
-
P4_Expression *P4_CreateEndOfInput()¶
Create an End-Of-Input expression.
Example:
P4_Expression* expr = P4_CreateEndOfInput();
End-Of-Input can be used to insert whitespace after the actual content.
Example:
P4_AddSequenceWithMembers(grammar, "R1", 2, P4_CreateLiteral("HELLO", true), P4_CreateEndOfInput() );
End-Of-Input is equivalent to P4_CreateNegative(P4_CreateRange(1, 0x10ffff, 1)).
- Returns
A P4_Expression.
-
P4_Error P4_AddJoin(P4_Grammar *grammar, P4_String name, const P4_String joiner, P4_String reference)¶
Add a Join expression as grammar rule.
Example:
P4_Error err = P4_AddJoin(grammar, "R1", ",", "element");
- Parameters
grammar – The grammar.
name – The grammar rule name.
joiner – A joiner literal.
reference – The repeated pattern rule name.
- Returns
The error code.
-
P4_Expression *P4_CreateJoin(const P4_String joiner, P4_String rule_name)¶
Create a Join expression.
Example:
P4_Expression* row = P4_CreateJoin(",", "element");
Join is syntactic sugar for the structure below:
Sequence(pattern, ZeroOrMore(Sequence(Literal(joiner), pattern)))
- Parameters
joiner – A joiner literal.
rule_name – The repeated pattern rule name.
- Returns
A P4_Expression.
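The Sequence(pattern, ZeroOrMore(Sequence(Literal(joiner), pattern))) structure can be sketched as a standalone matcher. Here the repeated pattern is assumed to be a run of ASCII digits purely for illustration. `demo_element` and `demo_join` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical element pattern: one or more ASCII digits.
 * Returns the number of bytes matched (0 when no digit is present). */
static size_t demo_element(const char *text) {
    size_t n = 0;
    while (text[n] >= '0' && text[n] <= '9') n++;
    return n;
}

/* element (joiner element)*
 * Returns the bytes consumed (0 when the first element is missing) and
 * stores the number of elements matched in *elements. */
static size_t demo_join(const char *text, const char *joiner, size_t *elements) {
    size_t jlen = strlen(joiner);
    size_t pos = demo_element(text);
    *elements = 0;
    if (pos == 0) return 0;
    *elements = 1;
    for (;;) {
        if (strncmp(text + pos, joiner, jlen) != 0) break;
        size_t n = demo_element(text + pos + jlen);
        if (n == 0) break;   /* a trailing joiner is left unconsumed */
        pos += jlen + n;
        (*elements)++;
    }
    return pos;
}
```

Note that a trailing joiner is not part of the match: for the input `1,2,` only `1,2` is consumed.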
-
void P4_DeleteExpression(P4_Expression *expr)¶
Free an expression.
Example:
P4_Expression* expr = P4_CreateLiteral("a", true); P4_DeleteExpression(expr);
The members of a sequence/choice expression will be deleted. The ref_expr of a positive/negative expression will be deleted.
P4_Reference only holds a reference, so the referenced expression won’t be freed.
- Parameters
expr – The expression.
-
bool P4_IsRule(P4_Expression *expr)¶
Check if an expression is a grammar rule.
- Parameters
expr – The expression.
- Returns
Whether the expression is a grammar rule.
P4_AddLiteral(grammar, "R1", "a", true);
P4_IsRule(P4_GetGrammarRule(grammar, "R1")); // true
-
bool P4_IsSquashed(P4_Expression *expr)¶
Check if an expression has P4_FLAG_SQUASHED flag.
-
bool P4_IsLifted(P4_Expression *expr)¶
Check if an expression has P4_FLAG_LIFTED flag.
-
bool P4_IsTight(P4_Expression *expr)¶
Check if an expression has P4_FLAG_TIGHT flag.
-
bool P4_IsScoped(P4_Expression *expr)¶
Check if an expression has P4_FLAG_SCOPED flag.
-
bool P4_IsSpaced(P4_Expression *expr)¶
Check if an expression has P4_FLAG_SPACED flag.
-
void P4_SetExpressionFlag(P4_Expression *expr, P4_ExpressionFlag flag)¶
Set the flag for an expression.
Example:
P4_SetExpressionFlag(expr, P4_FLAG_SQUASHED); P4_SetExpressionFlag(expr, P4_FLAG_TIGHT | P4_FLAG_SQUASHED);
- Parameters
expr – The expression.
flag – The flag to set.
-
P4_Grammar *P4_CreateGrammar(void)¶
Create a P4_Grammar object.
Example:
P4_Grammar* grammar = P4_CreateGrammar();
- Returns
A P4_Grammar object.
-
void P4_DeleteGrammar(P4_Grammar *grammar)¶
Delete the grammar object.
It will also free all of the expression rules.
- Parameters
grammar – The grammar.
-
P4_Error P4_AddGrammarRule(P4_Grammar *grammar, P4_String name, P4_Expression *expr)¶
Add a grammar rule.
- Parameters
grammar – The grammar.
name – The grammar rule name.
expr – The grammar rule expression.
- Returns
The error code.
-
void P4_DeleteGrammarRule(P4_Grammar *grammar, const P4_String name)¶
Delete a grammar rule.
WARNING: NOT IMPLEMENTED.
- Parameters
grammar – The grammar.
name – The grammar rule name.
-
P4_Expression *P4_GetGrammarRule(P4_Grammar *grammar, P4_String name)¶
Get a grammar rule by its name.
- Parameters
grammar – The grammar.
name – The grammar rule name.
- Returns
The grammar rule expression. Returns NULL if not found.
P4_AddLiteral(grammar, "a", "a", true);
P4_Expression* expr = P4_GetGrammarRule(grammar, "a"); // The literal expression.
-
P4_Error P4_SetGrammarRuleFlag(P4_Grammar *grammar, P4_String name, P4_ExpressionFlag flag)¶
Set the flag of a grammar rule.
Example:
P4_Error err = P4_SetGrammarRuleFlag(grammar, "entry", P4_FLAG_SQUASHED | P4_FLAG_LIFTED | P4_FLAG_TIGHT); if (err != P4_Ok) { printf("err=%u\n", err); exit(1); }
- Parameters
grammar – The grammar.
name – The grammar rule name.
flag – The bits of P4_ExpressionFlag.
- Returns
The error code. If successful, return P4_Ok.
-
P4_Error P4_SetRecursionLimit(P4_Grammar *grammar, size_t limit)¶
Set the maximum allowed recursion calls.
Example:
// on a machine with small memory.
P4_SetRecursionLimit(grammar, 256);
// on a machine with large memory.
P4_SetRecursionLimit(grammar, 1024*20);
- Parameters
grammar – The grammar.
limit – The number of maximum recursion calls.
- Returns
An error. If successful, return P4_Ok.
-
P4_Error P4_SetUserDataFreeFunc(P4_Grammar *grammar, P4_UserDataFreeFunc free_func)¶
Set free function for the user data.
- Parameters
grammar – The grammar.
free_func – The free function.
- Returns
The error code.
-
size_t P4_GetRecursionLimit(P4_Grammar *grammar)¶
Get the maximum allowed recursion calls.
Example:
size_t limit = P4_GetRecursionLimit(grammar);
printf("limit=%zu\n", limit);
- Parameters
grammar – The grammar.
- Returns
The maximum allowed recursion calls. If something goes wrong, return 0.
-
P4_Source *P4_CreateSource(P4_String content, P4_String entry_name)¶
Create a P4_Source* object.
Example:
char* content = ... // read from some files.
P4_Source* source = P4_CreateSource(content, "entry");
// do something...
P4_DeleteSource(source);
- Parameters
content – The content of input.
entry_name – The entry grammar rule name.
- Returns
A source object.
-
void P4_DeleteSource(P4_Source *source)¶
Free the allocated memory of a source. If the source has been parsed and has an AST, the entire AST will also be freed.
Example:
P4_DeleteSource(source);
- Parameters
source – The source.
-
void P4_ResetSource(P4_Source *source)¶
Reset the source so it can be called with P4_Parse again. All of its internal states are cleared.
Example:
// parse full text.
P4_Parse(grammar, source);
P4_ResetSource(source);
P4_SetSourceSlice(source, 1, 10);
// parse text[1:10].
P4_Parse(grammar, source);
- Parameters
source – The source.
-
P4_Error P4_SetSourceSlice(P4_Source *source, size_t start, size_t stop)¶
Set the buf size of the source content. If not set, strlen(source->content) is used.
Example:
P4_String input = "(a)";
P4_Source* source = P4_CreateSource(input, "a");
if (P4_Ok != P4_SetSourceSlice(source, 1, 2)) // only parse "a"
    printf("set buf size error\n");
- Parameters
source – The source.
start – The start position, inclusive.
stop – The stop position, exclusive.
- Returns
The error code.
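The half-open [start, stop) convention can be sketched with a standalone model of a sliced source. `DemoSource`, `demo_set_slice` and `demo_copy_slice` are hypothetical. The bounds check shown here (start not beyond stop, stop not beyond the content length) is an assumption about what a sensible validation could look like, not a description of the library's behaviour.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *content;
    size_t start;   /* inclusive */
    size_t stop;    /* exclusive */
} DemoSource;

/* Restrict the source to content[start, stop). Returns 0 on success, -1 on bad bounds. */
static int demo_set_slice(DemoSource *src, size_t start, size_t stop) {
    size_t len = strlen(src->content);
    if (start > stop || stop > len) return -1;
    src->start = start;
    src->stop = stop;
    return 0;
}

/* Copy the current slice into buf as a NUL-terminated string.
 * Returns the number of bytes copied (truncated to fit bufsize). */
static size_t demo_copy_slice(const DemoSource *src, char *buf, size_t bufsize) {
    size_t n = src->stop - src->start;
    if (bufsize == 0) return 0;
    if (n >= bufsize) n = bufsize - 1;
    memcpy(buf, src->content + src->start, n);
    buf[n] = '\0';
    return n;
}
```

For the content `(a)`, the slice (1, 2) selects just `a`, matching the example above.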
-
P4_Node *P4_GetSourceAst(P4_Source *source)¶
Get the root node of abstract syntax tree of the source.
- Parameters
source – The source.
- Returns
The root node. If the parse failed or the root node is lifted, return NULL.
-
P4_Node *P4_AcquireSourceAst(P4_Source *source)¶
Transfer the ownership of source ast to the caller.
You’re on your own now to manage the de-allocation of the source ast.
The source is implicitly reset after the source ast is acquired by the caller.
Example:
P4_Node* root = P4_AcquireSourceAst(source);
// ...
P4_DeleteNode(grammar, root);
- Parameters
source – The source.
- Returns
The root node. If the parse failed or the root node is lifted, return NULL.
-
size_t P4_GetSourcePosition(P4_Source *source)¶
Get the last position in the input after a parse.
- Parameters
source – The source.
- Returns
The position in the input.
-
void P4_JsonifySourceAst(FILE *stream, P4_Node *node, P4_Formatter formatter)¶
Print the node tree.
Example:
P4_Node* root = P4_GetSourceAst(source);
P4_JsonifySourceAst(stdout, root, NULL);
- Parameters
stream – The output stream.
node – The root node of source ast.
formatter – A callback function to format node.
-
P4_Error P4_InspectSourceAst(P4_Node *node, void *userdata, P4_Error (*inspector)(P4_Node*, void*))¶
Inspect the node tree.
Example:
P4_Error MyInspector(P4_Node* node, void* userdata) {
    printf("%lu\t%lu\t%p\n", node->slice.start.pos, P4_GetRuleName(node), userdata);
    return P4_Ok;
}
P4_Error err = P4_InspectSourceAst(root, NULL, MyInspector);
- Parameters
node – The root node of source ast.
userdata – Any additional information you want to pass in.
inspector – The inspecting function. It should return P4_Ok for a successful inspection.
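The inspector callback follows the usual visitor pattern: a function is called for each node, receives the caller's userdata, and reports success or failure through its return value. The sketch below shows one plausible way such a traversal can work, using hypothetical stand-in types (`DemoNode`, `DemoError`) rather than the library's own. The depth-first, parent-before-children order and the stop-at-first-error behavior are assumptions, not documented guarantees of P4_InspectSourceAst.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT Peppa PEG types. */
typedef enum { DEMO_OK = 0, DEMO_ERROR = 1 } DemoError;

typedef struct DemoNode {
    const char      *name;
    struct DemoNode *head;  /* first child, or NULL */
    struct DemoNode *next;  /* next sibling, or NULL */
} DemoNode;

typedef DemoError (*DemoInspector)(DemoNode *node, void *userdata);

/* Visit the node, then each child subtree in order.
 * Stops and returns the error as soon as the inspector fails. */
static DemoError demo_inspect(DemoNode *node, void *userdata,
                              DemoInspector inspector)
{
    DemoNode *child;
    DemoError err;

    assert(inspector != NULL);
    if (node == NULL)
        return DEMO_OK;
    if ((err = inspector(node, userdata)) != DEMO_OK)
        return err;
    for (child = node->head; child != NULL; child = child->next)
        if ((err = demo_inspect(child, userdata, inspector)) != DEMO_OK)
            return err;
    return DEMO_OK;
}

/* Example inspector: counts visited nodes through userdata. */
static DemoError count_nodes(DemoNode *node, void *userdata)
{
    (void)node;
    *(size_t *)userdata += 1;
    return DEMO_OK;
}
```

Passing a pointer to a `size_t` counter as userdata lets `count_nodes` accumulate a total across the whole tree without global state.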
-
P4_Error P4_Parse(P4_Grammar *grammar, P4_Source *source)¶
Parse the source given a grammar.
Example:
P4_Error err = P4_Ok;
if ((err = P4_Parse(grammar, source)) != P4_Ok) {
    // error handling ...
}
- Parameters
grammar – The grammar.
source – The source.
- Returns
The error code. If successful, return P4_Ok.
-
bool P4_HasError(P4_Source *source)¶
Determine whether the parse failed.
Example:
if (P4_HasError(source)) {
    printf("err=%u\n", P4_GetError(source));
    printf("msg=%s\n", P4_GetErrorMessage(source));
}
- Parameters
source – The source.
- Returns
Whether the parse failed.
-
P4_Error P4_GetError(P4_Source *source)¶
Get the error code if the source failed to parse.
Example:
if (P4_Ok != P4_Parse(grammar, source)) {
    printf("err=%u\n", P4_GetError(source));
}
- Parameters
source – The source.
- Returns
The error code.
-
P4_String P4_GetErrorString(P4_Error err)¶
Get the error string given an error code.
Example:
P4_String errstr = P4_GetErrorString(P4_MatchError);
printf("%s\n", errstr); // "MatchError"
- Parameters
err – The error code.
- Returns
The error string.
-
P4_String P4_GetErrorMessage(P4_Source *source)¶
Get the error message if the source failed to parse.
Example:
if (P4_Ok != P4_Parse(grammar, source)) {
    printf("msg=%s\n", P4_GetErrorMessage(source));
}
The returned value is a reference to the internal string. Don’t free it after use.
- Parameters
source – The source.
- Returns
The error message.
-
P4_Node *P4_CreateNode(P4_String text, P4_Position *start, P4_Position *stop, P4_String rule)¶
Create a node.
Example:
P4_String text = "Hello world";
P4_Position start = { .pos = 0 };
P4_Position stop = { .pos = 11 };
P4_String rule = "entry";
P4_Node* node = P4_CreateNode(text, &start, &stop, rule);
// do something.
P4_DeleteNode(grammar, node);
- Parameters
text – The source text.
start – The starting position of the text.
stop – The stopping position of the text.
rule – The name of rule expression that matches to the slice of the text.
- Returns
The node.
-
void P4_DeleteNode(P4_Grammar *grammar, P4_Node *node)¶
Delete the node. This frees the memory occupied by the node. The text of the node is not freed, because the node owns only a slice of the string, not the string itself.
Example:
P4_DeleteNode(grammar, node);
- Parameters
grammar – The grammar.
node – The node.
-
void P4_DeleteNodeChildren(P4_Grammar *grammar, P4_Node *node)¶
Delete the node children. This will free the occupied memory for all node children.
Example:
P4_DeleteNodeChildren(grammar, node);
- Parameters
grammar – The grammar.
node – The node.
-
P4_Slice *P4_GetNodeSlice(P4_Node *node)¶
Get the slice that the node covers. The slice is owned by the node so don’t free it.
Example:
P4_Slice* slice = P4_GetNodeSlice(node);
printf("node slice=[%zu..%zu]\n", slice->start.pos, slice->stop.pos);
- Parameters
node – The node.
- Returns
The slice.
-
size_t P4_GetNodeChildrenCount(P4_Node *node)¶
Get the total number of node children.
Example:
P4_Node* root = P4_GetSourceAst(source);
size_t count = P4_GetNodeChildrenCount(root);
printf("There are %zu children for root node\n", count);
- Parameters
node – The node.
- Returns
The total number of node children.
-
P4_String P4_CopyNodeString(P4_Node *node)¶
Copy the string that the node covers. The caller is responsible for freeing the string.
Example:
P4_String str = P4_CopyNodeString(node);
printf("node str=%s\n", str);
free(str);
- Parameters
node – The node.
- Returns
The string.
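Because a node only refers to a range of the source text, obtaining a standalone string means allocating a new buffer and copying that range into it. The sketch below shows the general technique with a hypothetical helper, `copy_range`, which is not part of Peppa PEG and uses plain `malloc` rather than the library's configurable allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, NOT part of Peppa PEG.
 * Copies text[start, stop) into a new NUL-terminated buffer.
 * Returns NULL on allocation failure; the caller frees the result. */
static char *copy_range(const char *text, size_t start, size_t stop)
{
    size_t len;
    char *out;

    assert(text != NULL && start <= stop && stop <= strlen(text));
    len = stop - start;
    out = malloc(len + 1);          /* +1 for the terminating NUL */
    if (out == NULL)
        return NULL;
    memcpy(out, text + start, len);
    out[len] = '\0';
    return out;
}
```

The copy stays valid after the original text is released, which is why the caller, not the node, must free it.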
-
P4_Error P4_SetGrammarCallback(P4_Grammar *grammar, P4_MatchCallback matchcb, P4_ErrorCallback errcb)¶
Set the match and error callback functions of the grammar.
- Parameters
grammar – The grammar.
matchcb – The callback on a successful match.
errcb – The callback on a failed match.
- Returns
The error code.
-
P4_Error P4_ReplaceGrammarRule(P4_Grammar *grammar, P4_String name, P4_Expression *expr)¶
Replace an existing grammar rule.
The original grammar rule will be deleted.
- Parameters
grammar – The grammar.
name – The rule name.
expr – The rule expression to replace.
- Returns
The error code.
-
P4_Grammar *P4_CreatePegGrammar()¶
Create a grammar that can parse other grammars written in PEG syntax.
Example:
P4_Grammar* peg = P4_CreatePegGrammar(); P4_DeleteGrammar(peg);
- Returns
The grammar object.
-
P4_Error P4_LoadGrammarResult(P4_String rules, P4_Result *result)¶
Load a PEG grammar from a string into a result object.
If the function returns P4_Ok, the loaded grammar is available as result->grammar.
Example:
P4_Result result = {0};
if (P4_Ok == P4_LoadGrammarResult(RULES, &result)) {
    P4_Grammar* grammar = result.grammar;
} else {
    printf("%s\n", result.errmsg);
}
- Parameters
rules – The rules string.
result – The P4_Result object.
- Returns
The error code.
-
P4_Grammar *P4_LoadGrammar(P4_String rules)¶
Load a PEG grammar from a string.
Example:
P4_Grammar* grammar = P4_LoadGrammar(
    "entry = one one;\n"
    "one = \"1\";\n"
);
P4_Source* source1 = P4_CreateSource("11", "entry");
P4_Parse(grammar, source1);
P4_Source* source2 = P4_CreateSource("1", "one");
P4_Parse(grammar, source2);
This function exits the program when an error occurs.
- Parameters
rules – The grammar rules string.
- Returns
The grammar object.
-
P4_ConstString P4_GetRuleName(P4_Expression *expr)¶
Get the rule name.
Example:
P4_ConstString name = P4_GetRuleName(expr);
- Parameters
expr – The rule expression.
- Returns
The rule name.