Commit 75d8e14

fangfang1984 and fangshen authored
New high performance attributes implement for cppast.net (#80)
* Enhance for function template & CppType FullName support.
  1. Add support for function templates.
  2. Add inline namespace support.
  3. Make FullName a CppType property and deduce the CppClass full name correctly for specialized templates, so the CppClass full name can be used directly in generated C++ code.
  4. FindByFullName() now ignores inline namespaces automatically when searching (for example, clang's std::__1::vector can be found as std::vector).
  5. Fix a crash when a CppClass has a specialized template with PartialSpecializedTemplateDecl.
  6. Fix a crash when a typedef has an AliasTemplateDecl as its underlying type.
* Partial specialized template test.
  1. Add a test for partial specialized templates.
  2. Add IsSpecializedArgument to CppTemplateArgument, so we can detect whether an argument is specialized.
* --other=
  1. Move Tokenizer to the CppTokenUtil class.
  2. Add a separate list for token-parser attributes and mark it as deprecated.
* --other=
  1. Add a document for the new attributes at doc/attributes.md.
  2. Support new attributes through __cppast().
  3. Default system attribute support no longer uses the token parser.

---------

Co-authored-by: fangshen <fangshen@tencent.com>
1 parent 5dead77 commit 75d8e14

23 files changed, with 2116 additions and 1045 deletions

doc/attributes.md

Lines changed: 154 additions & 0 deletions
@@ -0,0 +1,154 @@
## 1. `cppast.net 0.12` Support for `attributes`
Support in `cppast.net` for the various kinds of `attributes`, including the `meta attribute` (`[[]]`) syntax standardized in C++11, has been limited by the `libclang` API itself, so the related features had to rely on token-level parsing. `cppast.net 0.12` and earlier versions are implemented this way. Even for `attributes` that `libclang` supports well, such as `dllexport` and `dllimport`, `cppast.net` mostly uses token parsing. This approach is flexible, since we can always try to parse an `attribute` at the `token` level, but it brings several problems and restrictions:
1. `ParseAttributes()` is extremely time-consuming, which is why later versions added the `ParseAttributes` option to control whether `attributes` are parsed at all. Some scenarios depend on `attributes`, however, so switching them off is inconvenient.
2. Parsing of the `meta attribute` (`[[]]`) is defective. A `meta attribute` placed above a `Function` or `Field` is legal C++, but `cppast.net` does not handle this placement well. (Declarations such as `namespace`, `class`, and `enum` are exceptions: the attribute cannot be placed above them, and the compiler rejects such usage. It must follow the keyword, as in `class [[deprecated]] Abc{};`.)
3. Individual arguments of a `meta attribute` may use macros. Because the original implementation parses raw `tokens`, which are read before macro expansion, such arguments cannot be handled correctly.

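The placement rules in item 2 can be illustrated with standard attributes. The sketch below is only an illustration: the names `Compute`, `Point`, `Handle`, and `CheckPlacement` are hypothetical, and `[[nodiscard]]` and `[[maybe_unused]]` (both C++17) stand in for whatever attribute is being attached.

```cpp
#include <cassert>

// Attribute on its own line above a function: legal C++.
[[nodiscard]]
int Compute() { return 42; }

struct Point {
    // Attribute on its own line above a field: also legal.
    [[maybe_unused]]
    int x = 1;
};

// For a class, the attribute follows the keyword instead of preceding it.
class [[nodiscard]] Handle {
public:
    int value = 7;
};

// Small self-check that exercises the declarations above.
inline void CheckPlacement() {
    Point p;
    Handle h;
    assert(Compute() + p.x + h.value == 50);
}
```

A tool that only scans tokens has to handle both positions itself, which is where the original implementation falls short for the "above the declaration" form.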
---
## 2. A brief introduction to cases where `attribute` is needed

---
### 2.1 System-level `attribute`
Take this snippet from the `cppast.net` test cases as an example:
```cpp
#ifdef WIN32
#define EXPORT_API __declspec(dllexport)
#else
#define EXPORT_API __attribute__((visibility("default")))
#endif
```
`Attributes` such as `dllexport` and `visibility` control interface visibility and are used frequently, so they should be available even when `ParseAttributes` is turned off. These basic system attributes need a high-performance solution whose behavior does not depend on that switch.

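As a minimal sketch of how such a macro is typically applied, the example below attaches `EXPORT_API` to a free function and to a class. `Add`, `Counter`, and `CheckExports` are hypothetical names used only for illustration; they do not come from the `cppast.net` test suite.

```cpp
#include <cassert>

// Same macro pattern as the snippet above.
#ifdef WIN32
#define EXPORT_API __declspec(dllexport)
#else
#define EXPORT_API __attribute__((visibility("default")))
#endif

// Hypothetical exported free function.
EXPORT_API int Add(int a, int b) { return a + b; }

// Hypothetical exported class; the macro goes after the class keyword.
class EXPORT_API Counter {
public:
    int Next() { return ++count_; }

private:
    int count_ = 0;
};

// Small self-check showing that the attribute does not change behavior.
inline void CheckExports() {
    Counter c;
    assert(Add(2, 3) == 5);
    assert(c.Next() == 1);
}
```

These are the kinds of declarations on which a binding or export tool would want to read the visibility information without paying for token parsing.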
---
### 2.2 Injection of Additional Information by Export Tools and Other Tools
&emsp;&emsp;Take the following class definition as an example:
```cpp
#include <cstdint>

#if !defined(__cppast)
#define __cppast(...)
#endif

struct __cppast(msgid = 1) TestPBMessage {
public:
    __cppast(id = 1)
    float x;
    __cppast(id = 2)
    double y;
    __cppast(id = 3)
    uint64_t z;
};
```
To better support serialization and deserialization of `TestPBMessage`, with a certain degree of fault tolerance, we attach extra information to the original struct definition:
1. The `msgid` of `TestPBMessage`, specified here as the integer `1`.
2. The `id` of `x`, `y`, and `z`, specified here as `1`, `2`, and `3` respectively.

An offline processing tool built on `cppast.net` needs a convenient way to read the `meta attributes` injected through `__cppast()`, which have no effect on how the original code compiles, and to use them as needed. In `cppast.net 0.12` and earlier versions, this path is slow and limited. For example, it cannot handle the case above, where the `attribute` is placed directly on a `Field`.

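The claim that the annotations do not affect normal compilation can be checked directly. The sketch below compares the annotated struct with a hypothetical unannotated twin, `PlainMessage`; that name and `CheckLayout` are assumptions made for this illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fallback definition: outside cppast.net the macro expands to nothing.
#if !defined(__cppast)
#define __cppast(...)
#endif

struct __cppast(msgid = 1) TestPBMessage {
    __cppast(id = 1)
    float x;
    __cppast(id = 2)
    double y;
    __cppast(id = 3)
    std::uint64_t z;
};

// Hypothetical twin of the struct with no annotations at all.
struct PlainMessage {
    float x;
    double y;
    std::uint64_t z;
};

// With the fallback macro, both structs have the same size and field offsets.
static_assert(sizeof(TestPBMessage) == sizeof(PlainMessage), "same size");
static_assert(offsetof(TestPBMessage, y) == offsetof(PlainMessage, y), "same offset for y");
static_assert(offsetof(TestPBMessage, z) == offsetof(PlainMessage, z), "same offset for z");

// The annotated struct is still a plain aggregate.
inline void CheckLayout() {
    TestPBMessage m{1.5f, 2.5, 3};
    assert(m.x == 1.5f && m.y == 2.5 && m.z == 3u);
}
```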
---
## 3. New Implementation and Adjustment
&emsp;&emsp;The new implementation addresses the limitations described above and the application scenarios from the previous chapter. `Attributes` are now divided into three kinds:
1. `AttributeKind.CxxSystemAttribute`: system `attributes` that `libclang` itself parses well, such as the `visibility` attribute mentioned above, as well as `[[deprecated]]`, `[[noreturn]]`, and so on. With the help of `ClangSharp`, they are parsed and handled efficiently, so no switch is needed to control them.
2. `AttributeKind.TokenAttribute`: as the name suggests, the `attribute` from earlier versions of `cppast.net`. It is marked as `deprecated`, but `token` parsing remains available as a fallback mechanism. The relevant `Tokenizer` code is kept and used sparingly, to implement complex features that `ClangSharp` cannot provide.
3. `AttributeKind.AnnotateAttribute`: replaces the original `meta attribute` that was implemented with `token` parsing. It provides a way to inject information into classes and members, as introduced earlier, with high performance and few restrictions.

The following sections briefly introduce the implementation ideas and usage of each kind of `attribute`.

---
### 3.1 `AttributeKind.CxxSystemAttribute`
&emsp;&emsp;We added a function that handles the `attributes` that `ClangSharp` itself supports:
```cs
private List<CppAttribute> ParseSystemAndAnnotateAttributeInCursor(CXCursor cursor)
{
    List<CppAttribute> collectAttributes = new List<CppAttribute>();
    // Visit the cursor's children; attribute cursors appear among them.
    cursor.VisitChildren((argCursor, parentCursor, clientData) =>
    {
        var sourceSpan = new CppSourceSpan(GetSourceLocation(argCursor.SourceRange.Start), GetSourceLocation(argCursor.SourceRange.End));
        var meta = argCursor.Spelling.CString;
        switch (argCursor.Kind)
        {
            case CXCursorKind.CXCursor_VisibilityAttr:
                //...
                break;
            case CXCursorKind.CXCursor_AnnotateAttr:
                //...
                break;
            case CXCursorKind.CXCursor_AlignedAttr:
                //...
                break;
            //...
            default:
                break;
        }

        return CXChildVisitResult.CXChildVisit_Continue;

    }, new CXClientData((IntPtr)0));
    return collectAttributes;
}
```
Using the existing features of `ClangSharp`, attributes such as the `visibility attribute` are handled efficiently here. Note the `AnnotateAttr` case: it backs the new `meta attribute` described in section 3.3 and is the key to its performance. The relevant `cursor` is read directly from `libclang`'s `AST`, which avoids processing the data at the costly `token` level.

---
### 3.2 `AttributeKind.TokenAttribute`
&emsp;&emsp;For compatibility with older versions, `attributes` produced by `token` parsing have been moved, for now, from the original `Attributes` property to the `TokenAttributes` property. The new `CxxSystemAttribute` and `AnnotateAttribute` are stored in the original `Attributes` property. See the relevant test cases for specific usage.

---
### 3.3 `AttributeKind.AnnotateAttribute`
&emsp;&emsp;We need a mechanism that implements the `meta attribute` without `token` parsing. The `annotate` attribute serves this purpose. The new built-in macros show how it works:
```cs
//Add default macros here for CppAst.Net
Defines = new List<string>() {
    "__cppast_run__",                                                //Identifies that the code is being processed by CppAst.Net
    @"__cppast_impl(...)=__attribute__((annotate(#__VA_ARGS__)))",   //Wraps the annotate attribute for convenience
    @"__cppast(...)=__cppast_impl(__VA_ARGS__)",                     //Wrapper macro, so that macros in the arguments are expanded by the compiler before being stringified
};
```
> [!note]
> These three system macros are not parsed into `CppMacro` or added to the final parsing result, to avoid polluting the output.

In the end, the variadic argument `__VA_ARGS__` is simply converted to a string and injected through `__attribute__((annotate(???)))`. So if we add the following in the right place, as the test code does:
```cpp
#if !defined(__cppast)
#define __cppast(...)
#endif
```
When the code is parsed by `cppast.net`, `__cppast` is already defined by the built-in macros, so its arguments are recognized and read as an `annotate attribute`. Outside of `cppast.net`, the fallback definition expands to nothing and the data passed to `__cppast()` is ignored, so it does not interfere with compiling or running the code. In this way, we indirectly achieve injecting and reading a `meta attribute`.

For the macro case:
```cpp
#if !defined(__cppast)
#define __cppast(...)
#endif

#define UUID() 12345

__cppast(id=UUID(), desc="a function with macro")
void TestFunc()
{
}
```
Relevant test code:
```cs
//annotate attribute support with a macro in the arguments
var func = compilation.Functions[0];
Assert.AreEqual(1, func.Attributes.Count);
Assert.AreEqual(func.Attributes[0].Kind, AttributeKind.AnnotateAttribute);
Assert.AreEqual(func.Attributes[0].Arguments, "id=12345, desc=\"a function with macro\"");
```
Because `__cppast` is defined as a wrapper around `__cppast_impl`, macros in the arguments are expanded before they are stringified, so they also work inside a `meta attribute`.

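The effect of the wrapper can be reproduced with the preprocessor alone. The sketch below uses hypothetical stand-ins, `DEMO_STRINGIFY` and `DEMO_WRAPPED`, that keep only the stringification step of `__cppast_impl` and `__cppast`; the `annotate` attribute itself is left out so the result can be inspected as an ordinary string.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for __cppast_impl / __cppast, reduced to the
// stringification step.
#define DEMO_STRINGIFY(...) #__VA_ARGS__
#define DEMO_WRAPPED(...) DEMO_STRINGIFY(__VA_ARGS__)

#define UUID() 12345

// Direct use of '#': the argument is stringified before macro expansion,
// so the text still contains the macro call.
const std::string direct = DEMO_STRINGIFY(id=UUID(), desc="a function with macro");

// Through the wrapper: the argument is fully expanded first, then stringified.
const std::string wrapped = DEMO_WRAPPED(id=UUID(), desc="a function with macro");

// Small self-check of the two behaviors.
inline void CheckExpansion() {
    assert(direct.find("UUID()") != std::string::npos);
    assert(wrapped.find("12345") != std::string::npos);
    assert(wrapped.find("UUID") == std::string::npos);
}
```

This is the standard two-level macro idiom: an argument that is the operand of `#` is not expanded, so one extra level of indirection is needed to expand it first.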
An `attribute` placed on the line above a `Function` or `Field` (an `outline attribute`) is also well supported. Defining multiple `attributes` on one object is legal as well:
```cpp
__cppast(id = 1)
__cppast(name = "x")
__cppast(desc = "???")
float x;
```

---
## 4. Conclusion
&emsp;&emsp;This article introduces the new `attributes` supported by `cppast.net`, which fall into three categories:
1. CxxSystemAttribute
2. TokenAttribute
3. AnnotateAttribute

We recommend `CxxSystemAttribute` and `AnnotateAttribute`, which do not require a switch. `TokenAttribute` exists mainly for compatibility with old implementations. These attributes have been moved into a separate `TokenAttributes` property to distinguish them from the first two, and the corresponding `CppParserOptions` switch has been renamed to `ParseTokenAttributes`. Because of its performance and usage limitations, continued use is not recommended.
Lines changed: 142 additions & 0 deletions
@@ -0,0 +1,142 @@
// Copyright (c) Alexandre Mutel. All rights reserved.
// Licensed under the BSD-Clause 2 license.
// See license.txt file in the project root for full license information.

using System;
using NUnit.Framework;

namespace CppAst.Tests
{
    public class TestAnnotateAttributes : InlineTestBase
    {
        [Test]
        public void TestAnnotateAttribute()
        {
            var text = @"

#if !defined(__cppast)
#define __cppast(...)
#endif

__cppast(script, is_browsable=true, desc=""a function"")
void TestFunc()
{
}

enum class __cppast(script, is_browsable=true, desc=""a enum"") TestEnum
{
};

class __cppast(script, is_browsable=true, desc=""a class"") TestClass
{
public:
__cppast(desc=""a member function"")
void TestMemberFunc();

__cppast(desc=""a member field"")
int X;
};
";

            ParseAssert(text,
                compilation =>
                {
                    Assert.False(compilation.HasErrors);

                    //annotate attribute support on global function
                    var cppFunc = compilation.Functions[0];
                    Assert.AreEqual(1, cppFunc.Attributes.Count);
                    Assert.AreEqual(cppFunc.Attributes[0].Kind, AttributeKind.AnnotateAttribute);
                    Assert.AreEqual(cppFunc.Attributes[0].Arguments, "script, is_browsable=true, desc=\"a function\"");

                    //annotate attribute support on enum
                    var cppEnum = compilation.Enums[0];
                    Assert.AreEqual(1, cppEnum.Attributes.Count);
                    Assert.AreEqual(cppEnum.Attributes[0].Kind, AttributeKind.AnnotateAttribute);
                    Assert.AreEqual(cppEnum.Attributes[0].Arguments, "script, is_browsable=true, desc=\"a enum\"");

                    //annotate attribute support on class
                    var cppClass = compilation.Classes[0];
                    Assert.AreEqual(1, cppClass.Attributes.Count);
                    Assert.AreEqual(cppClass.Attributes[0].Kind, AttributeKind.AnnotateAttribute);
                    Assert.AreEqual(cppClass.Attributes[0].Arguments, "script, is_browsable=true, desc=\"a class\"");

                    Assert.AreEqual(1, cppClass.Functions.Count);
                    var memFunc = cppClass.Functions[0];
                    Assert.AreEqual(1, memFunc.Attributes.Count);
                    Assert.AreEqual(memFunc.Attributes[0].Arguments, "desc=\"a member function\"");


                    Assert.AreEqual(1, cppClass.Fields.Count);
                    var memField = cppClass.Fields[0];
                    Assert.AreEqual(1, memField.Attributes.Count);
                    Assert.AreEqual(memField.Attributes[0].Arguments, "desc=\"a member field\"");
                }
            );
        }

        [Test]
        public void TestAnnotateAttributeInNamespace()
        {
            var text = @"

#if !defined(__cppast)
#define __cppast(...)
#endif

namespace __cppast(script, is_browsable=true, desc=""a namespace test"") TestNs{

}

";

            ParseAssert(text,
                compilation =>
                {
                    Assert.False(compilation.HasErrors);

                    //annotate attribute support on namespace
                    var ns = compilation.Namespaces[0];
                    Assert.AreEqual(1, ns.Attributes.Count);
                    Assert.AreEqual(ns.Attributes[0].Kind, AttributeKind.AnnotateAttribute);
                    Assert.AreEqual(ns.Attributes[0].Arguments, "script, is_browsable=true, desc=\"a namespace test\"");

                }
            );
        }

        [Test]
        public void TestAnnotateAttributeWithMacro()
        {
            var text = @"

#if !defined(__cppast)
#define __cppast(...)
#endif

#define UUID() 12345

__cppast(id=UUID(), desc=""a function with macro"")
void TestFunc()
{
}

";

            ParseAssert(text,
                compilation =>
                {
                    Assert.False(compilation.HasErrors);

                    //annotate attribute support with a macro in the arguments
                    var func = compilation.Functions[0];
                    Assert.AreEqual(1, func.Attributes.Count);
                    Assert.AreEqual(func.Attributes[0].Kind, AttributeKind.AnnotateAttribute);
                    Assert.AreEqual(func.Attributes[0].Arguments, "id=12345, desc=\"a function with macro\"");

                }
            );
        }
    }
}
