Best way to explore AST to search for specific patterns? #33

Open
ghost opened this issue Apr 16, 2021 · 3 comments
Assignees
Labels
feedback waiting for feedback question Further information is requested

Comments

@ghost

ghost commented Apr 16, 2021

Hello,

I am trying to migrate an existing project from kastree to this library.

I am struggling to retrieve the content of a parameter of an annotation that is on a method of a class.

I have a Kotlin file that looks like this (excerpt):

class MyClass {

    @KafkaListener(
        id = "\${'$'}{messaging.command.topic.consumer.group.name}",
        clientIdPrefix = "\${'$'}{messaging.command.topic.consumer.group.name}",
        topics = ["direct.topic.name.2", "\${'$'}{messaging.command.topic.name.2}"],
        concurrency = "\${'$'}{messaging.command.topic.listener-count}"
    )
    fun topicTest4MultipleMixedTopics(@Payload entityCommand: EntityCommand<JsonNode>, record: ConsumerRecord<String, Array<Byte>>) {
    }
}

What's the best way to get the content of the topics argument of the @KafkaListener annotation?

So far I have come up with this, which gives me the members of the class:

kotlinFile.summary(attachRawAst = false)
            .onSuccess { ast ->
                ast
                    .filterIsInstance<KlassDeclaration>() // filter on Classes
                    .flatMap { it.flatten("classBody") } // get the Class body
                    .flatMap { it.children } // get all the declarations within that class (functions etc)
                    .filterIsInstance(KlassDeclaration::class.java) // filter on KlassDeclaration
                    .flatMap { parseTopics2(it) } // parse topics from functions
            }

This tries to parse the function declaration block. Each node has a single node in its children, over and over, and I see no good way to get the content of the actual string.

private fun parseTopics2(func: KlassDeclaration): List<Pair<String, List<Schema>>> {
        func.children
            .asSequence()
            .filterIsInstance<KlassAnnotation>()
            .filter { it.description.contains(annotation) }
            .mapNotNull { it.arguments.firstOrNull { it.identifier.identifier == "topics" } }
            .mapNotNull { it.expressions.firstOrNull() }
            .filter { it.description == "collectionLiteral" }
            .filterIsInstance<DefaultAstNode>()
            .mapNotNull { it.children.getOrNull(1) }
            .toList()
            .flatMap { it.flatten("stringLiteral") }
        return emptyList()
    }

I am also not sure that calling .filterIsInstance<KlassAnnotation>() is a good way of filtering the AST. Surely there is a better way?

@drieks
Collaborator

drieks commented Apr 17, 2021

Hi @gauthier-roebroeck-mox,
KlassDeclaration has a member called annotations. Given func: KlassDeclaration, you can write func.annotations to get a list of all annotations. You can then compare the identifier to check whether the name of the annotation is KafkaListener. If it is, you can write:

anno.arguments.find { argument ->
  argument.identifier.identifierName() == "topics"
}

(import kotlinx.ast.common.klass.identifierName)

Reading the value is not so easy, because currently only parsing of top-level constructs is implemented. Once you have the argument, you can use children to read the value. In this case there is only one child, with description "collectionLiteral", which in turn has six children:

0 = {DefaultAstTerminal@4161} DefaultAstTerminal(description=LSQUARE, text=[, channel=AstChannel(id=0, name=DEFAULT_TOKEN_CHANNEL), attachments=AstAttachments(attachments={kotlinx.ast.common.ast.AstAttachmentAstInfo@4944de48=   65 [203..204]   [5:18..5:19]}))
1 = {DefaultAstNode@4162} DefaultAstNode(description=expression, children=[DefaultAstNode(description=disjunction, children=[DefaultAstNode(description=conjunction, children=[DefaultAstNode(description=equality, children=[DefaultAstNode(description=comparison, children=[DefaultAstNode(description=genericCallLikeComparison, children=[DefaultAstNode(description=infixOperation, children=[DefaultAstNode(description=elvisExpression, children=[DefaultAstNode(description=infixFunctionCall, children=[DefaultAstNode(description=rangeExpression, children=[DefaultAstNode(description=additiveExpression, children=[DefaultAstNode(description=multiplicativeExpression, children=[DefaultAstNode(description=asExpression, children=[DefaultAstNode(description=prefixUnaryExpression, children=[DefaultAstNode(description=postfixUnaryExpression, children=[DefaultAstNode(description=primaryExpression, children=[DefaultAstNode(description=stringLiteral, children=[DefaultAstNode(description=lineStringLiteral, children=[DefaultAstTerminal
2 = {DefaultAstTerminal@4163} DefaultAstTerminal(description=COMMA, text=,, channel=AstChannel(id=0, name=DEFAULT_TOKEN_CHANNEL), attachments=AstAttachments(attachments={kotlinx.ast.common.ast.AstAttachmentAstInfo@4944de48=   69 [225..226]   [5:40..5:41]}))
3 = {DefaultAstTerminal@4164} DefaultAstTerminal(description=Inside_WS, text= , channel=AstChannel(id=1, name=HIDDEN), attachments=AstAttachments(attachments={kotlinx.ast.common.ast.AstAttachmentAstInfo@4944de48=   70 [226..227]   [5:41..5:42]}))
4 = {DefaultAstNode@4165} DefaultAstNode(description=expression, children=[DefaultAstNode(description=disjunction, children=[DefaultAstNode(description=conjunction, children=[DefaultAstNode(description=equality, children=[DefaultAstNode(description=comparison, children=[DefaultAstNode(description=genericCallLikeComparison, children=[DefaultAstNode(description=infixOperation, children=[DefaultAstNode(description=elvisExpression, children=[DefaultAstNode(description=infixFunctionCall, children=[DefaultAstNode(description=rangeExpression, children=[DefaultAstNode(description=additiveExpression, children=[DefaultAstNode(description=multiplicativeExpression, children=[DefaultAstNode(description=asExpression, children=[DefaultAstNode(description=prefixUnaryExpression, children=[DefaultAstNode(description=postfixUnaryExpression, children=[DefaultAstNode(description=primaryExpression, children=[DefaultAstNode(description=stringLiteral, children=[DefaultAstNode(description=lineStringLiteral, children=[DefaultAstTerminal
5 = {DefaultAstTerminal@4166} DefaultAstTerminal(description=RSQUARE, text=], channel=AstChannel(id=0, name=DEFAULT_TOKEN_CHANNEL), attachments=AstAttachments(attachments={kotlinx.ast.common.ast.AstAttachmentAstInfo@4944de48=   77 [268..269]   [5:83..5:84]}))

You can try calling summary again on all children to get an easier-to-use AST.
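To illustrate the shape of the dump above: most of those levels (expression, disjunction, conjunction, ...) are single-child wrappers, so a small helper can skip them. This is only a sketch on hypothetical stand-in types (`Tree`, `Branch`, `Leaf`), not on the kotlinx.ast classes, and `skipSingleChildChain` is a made-up name.

```kotlin
// Hypothetical stand-in types; kotlinx.ast has its own node and terminal classes.
sealed class Tree { abstract val description: String }
data class Branch(override val description: String, val children: List<Tree>) : Tree()
data class Leaf(override val description: String, val text: String) : Tree()

// Follow a chain of single-child branches down to the first node that is
// either a leaf or a branch whose number of children is not exactly one.
tailrec fun skipSingleChildChain(node: Tree): Tree =
    if (node is Branch && node.children.size == 1) skipSingleChildChain(node.children[0]) else node

fun main() {
    val literal = Branch(
        "lineStringLiteral",
        listOf(
            Leaf("QUOTE_OPEN", "\""),
            Branch("lineStringContent", listOf(Leaf("LineStrText", "direct.topic.name.2"))),
            Leaf("QUOTE_CLOSE", "\"")
        )
    )
    // Wrap the literal in a few of the wrapper levels seen in the dump.
    val expression = listOf("stringLiteral", "primaryExpression", "postfixUnaryExpression", "expression")
        .fold<String, Tree>(literal) { child, description -> Branch(description, listOf(child)) }

    // The chain stops at lineStringLiteral, the first node with more than one child.
    println(skipSingleChildChain(expression).description)
}
```

With the real library the same idea would have to be written against its own node types and `children` property.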

Please let me know if you have any questions.

@ghost
Author

ghost commented Apr 21, 2021

The best I could come up with is this, which is not very pretty:

func.annotations
            .mapNotNull { annotation -> annotation.arguments.firstOrNull { it.identifier?.identifier == "topics" } }
            .mapNotNull { it.expressions.firstOrNull() }
            .flatMap { it.flatten("lineStringContent") }
            .flatMap { it.children }
            .filter { it.description == "LineStrText" }
            .filterIsInstance<DefaultAstTerminal>()
            .map { it.text }

Is it possible to use something in kotlinx.ast.common.filter to make it better? I couldn't find any documentation on TreeFilter.
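One thing to note about the pipeline above: it returns one flat list of LineStrText fragments, so the fragments of different string literals are mixed together and LineStrEscapedChar terminals are dropped. A possible alternative is to rebuild one string per lineStringLiteral. The sketch below shows that idea on hypothetical stand-in types (`Tree`, `Branch`, `Leaf`); `findAll`, `leafText`, `stringLiterals` and `literal` are made-up helpers, not part of kotlinx.ast. Escapes are kept as raw source text, not decoded.

```kotlin
// Hypothetical stand-in types; kotlinx.ast has its own node and terminal classes.
sealed class Tree { abstract val description: String }
data class Branch(override val description: String, val children: List<Tree>) : Tree()
data class Leaf(override val description: String, val text: String) : Tree()

// Depth-first search for every node with the wanted description.
fun Tree.findAll(wanted: String): List<Tree> {
    val self = if (description == wanted) listOf(this) else emptyList()
    val nested = if (this is Branch) children.flatMap { it.findAll(wanted) } else emptyList()
    return self + nested
}

// Concatenate the text of every leaf below this node, in source order.
fun Tree.leafText(): String = when (this) {
    is Leaf -> text
    is Branch -> children.joinToString("") { it.leafText() }
}

// One string per lineStringLiteral: only its lineStringContent children are
// joined, so the opening and closing quotes are skipped.
fun stringLiterals(root: Tree): List<String> =
    root.findAll("lineStringLiteral")
        .filterIsInstance<Branch>()
        .map { literal ->
            literal.children
                .filter { it.description == "lineStringContent" }
                .joinToString("") { it.leafText() }
        }

// Test helper: build a lineStringLiteral from its content fragments.
fun literal(vararg fragments: Leaf): Tree = Branch(
    "lineStringLiteral",
    listOf<Tree>(Leaf("QUOTE_OPEN", "\"")) +
        fragments.map { Branch("lineStringContent", listOf(it)) } +
        Leaf("QUOTE_CLOSE", "\"")
)

fun main() {
    val topics = Branch(
        "collectionLiteral",
        listOf(
            Leaf("LSQUARE", "["),
            literal(Leaf("LineStrText", "direct.topic.name.2")),
            Leaf("COMMA", ","),
            literal(Leaf("LineStrEscapedChar", "\\$"), Leaf("LineStrText", "{messaging.command.topic.name.2}")),
            Leaf("RSQUARE", "]")
        )
    )
    // Prints one line per string literal found in the collection.
    stringLiterals(topics).forEach(::println)
}
```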

drieks added a commit that referenced this issue Apr 22, 2021
@drieks
Collaborator

drieks commented Apr 22, 2021

Hi @gauthier-roebroeck-mox,

Yes, sadly I have only a little time to work on this library and I prefer to add new functionality, so there is almost no documentation. TreeFilter is mainly used as an internal API by TreeMapper, which is something like map/flatMap on an AST structure. If you are interested in an example:

// annotation
//     : (singleAnnotation | multiAnnotation) NL*
//     ;
.convert(
    filter = byDescription("annotation")
) { node: AstNode ->
    recursiveFlatten(node).flatMap { result ->
        astContinue(
            KlassAnnotation(
                identifier = result.filterIsInstance<KlassIdentifier>(),
                arguments = result.filterIsInstance<KlassDeclaration>()
            )
        )
    }
}

This code converts the annotation from your example, which is parsed into this AST:
annotation
singleAnnotation
AT_PRE_WS >>> @<<< (DEFAULT_TOKEN_CHANNEL)
unescapedAnnotation
constructorInvocation
userType
simpleUserType
simpleIdentifier
Identifier >>>KafkaListener<<< (DEFAULT_TOKEN_CHANNEL)
valueArguments
LPAREN >>>(<<< (DEFAULT_TOKEN_CHANNEL)
Inside_NL >>>\n<<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
valueArgument
simpleIdentifier
Identifier >>>id<<< (DEFAULT_TOKEN_CHANNEL)
Inside_WS >>> <<< (HIDDEN)
ASSIGNMENT >>>=<<< (DEFAULT_TOKEN_CHANNEL)
Inside_WS >>> <<< (HIDDEN)
expression
disjunction
conjunction
equality
comparison
genericCallLikeComparison
infixOperation
elvisExpression
infixFunctionCall
rangeExpression
additiveExpression
multiplicativeExpression
asExpression
prefixUnaryExpression
postfixUnaryExpression
primaryExpression
stringLiteral
lineStringLiteral
QUOTE_OPEN >>>"<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrEscapedChar >>>\$<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrText >>>{'<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrText >>>$<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrText >>>'}{messaging.command.topic.consumer.group.name}<<< (DEFAULT_TOKEN_CHANNEL)
QUOTE_CLOSE >>>"<<< (DEFAULT_TOKEN_CHANNEL)
COMMA >>>,<<< (DEFAULT_TOKEN_CHANNEL)
Inside_NL >>>\n<<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
valueArgument
simpleIdentifier
Identifier >>>clientIdPrefix<<< (DEFAULT_TOKEN_CHANNEL)
Inside_WS >>> <<< (HIDDEN)
ASSIGNMENT >>>=<<< (DEFAULT_TOKEN_CHANNEL)
Inside_WS >>> <<< (HIDDEN)
expression
disjunction
conjunction
equality
comparison
genericCallLikeComparison
infixOperation
elvisExpression
infixFunctionCall
rangeExpression
additiveExpression
multiplicativeExpression
asExpression
prefixUnaryExpression
postfixUnaryExpression
primaryExpression
stringLiteral
lineStringLiteral
QUOTE_OPEN >>>"<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrEscapedChar >>>\$<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrText >>>{'<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrText >>>$<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrText >>>'}{messaging.command.topic.consumer.group.name}<<< (DEFAULT_TOKEN_CHANNEL)
QUOTE_CLOSE >>>"<<< (DEFAULT_TOKEN_CHANNEL)
COMMA >>>,<<< (DEFAULT_TOKEN_CHANNEL)
Inside_NL >>>\n<<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
valueArgument
simpleIdentifier
Identifier >>>topics<<< (DEFAULT_TOKEN_CHANNEL)
Inside_WS >>> <<< (HIDDEN)
ASSIGNMENT >>>=<<< (DEFAULT_TOKEN_CHANNEL)
Inside_WS >>> <<< (HIDDEN)
expression
disjunction
conjunction
equality
comparison
genericCallLikeComparison
infixOperation
elvisExpression
infixFunctionCall
rangeExpression
additiveExpression
multiplicativeExpression
asExpression
prefixUnaryExpression
postfixUnaryExpression
primaryExpression
collectionLiteral
LSQUARE >>>[<<< (DEFAULT_TOKEN_CHANNEL)
expression
disjunction
conjunction
equality
comparison
genericCallLikeComparison
infixOperation
elvisExpression
infixFunctionCall
rangeExpression
additiveExpression
multiplicativeExpression
asExpression
prefixUnaryExpression
postfixUnaryExpression
primaryExpression
stringLiteral
lineStringLiteral
QUOTE_OPEN >>>"<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrText >>>direct.topic.name.2<<< (DEFAULT_TOKEN_CHANNEL)
QUOTE_CLOSE >>>"<<< (DEFAULT_TOKEN_CHANNEL)
COMMA >>>,<<< (DEFAULT_TOKEN_CHANNEL)
Inside_WS >>> <<< (HIDDEN)
expression
disjunction
conjunction
equality
comparison
genericCallLikeComparison
infixOperation
elvisExpression
infixFunctionCall
rangeExpression
additiveExpression
multiplicativeExpression
asExpression
prefixUnaryExpression
postfixUnaryExpression
primaryExpression
stringLiteral
lineStringLiteral
QUOTE_OPEN >>>"<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrEscapedChar >>>\$<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrText >>>{'<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrText >>>$<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrText >>>'}{messaging.command.topic.name.2}<<< (DEFAULT_TOKEN_CHANNEL)
QUOTE_CLOSE >>>"<<< (DEFAULT_TOKEN_CHANNEL)
RSQUARE >>>]<<< (DEFAULT_TOKEN_CHANNEL)
COMMA >>>,<<< (DEFAULT_TOKEN_CHANNEL)
Inside_NL >>>\n<<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
valueArgument
simpleIdentifier
Identifier >>>concurrency<<< (DEFAULT_TOKEN_CHANNEL)
Inside_WS >>> <<< (HIDDEN)
ASSIGNMENT >>>=<<< (DEFAULT_TOKEN_CHANNEL)
Inside_WS >>> <<< (HIDDEN)
expression
disjunction
conjunction
equality
comparison
genericCallLikeComparison
infixOperation
elvisExpression
infixFunctionCall
rangeExpression
additiveExpression
multiplicativeExpression
asExpression
prefixUnaryExpression
postfixUnaryExpression
primaryExpression
stringLiteral
lineStringLiteral
QUOTE_OPEN >>>"<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrEscapedChar >>>\$<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrText >>>{'<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrText >>>$<<< (DEFAULT_TOKEN_CHANNEL)
lineStringContent
LineStrText >>>'}{messaging.command.topic.listener-count}<<< (DEFAULT_TOKEN_CHANNEL)
QUOTE_CLOSE >>>"<<< (DEFAULT_TOKEN_CHANNEL)
Inside_NL >>>\n<<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
Inside_WS >>> <<< (HIDDEN)
RPAREN >>>)<<< (DEFAULT_TOKEN_CHANNEL)

into this summary:
KlassAnnotation(KafkaListener)
  KlassDeclaration(argument id)
    KlassString
      Escape("\$")
      "{'"
      "$"
      "'}{messaging.command.topic.consumer.group.name}"
  KlassDeclaration(argument clientIdPrefix)
    KlassString
      Escape("\$")
      "{'"
      "$"
      "'}{messaging.command.topic.consumer.group.name}"

    // this call adds a new definition to the TreeMapper
    .convert(
        // the byDescription filter selects all AST nodes (including terminal symbols) with the given description
        filter = byDescription("annotation")
    // select only AstNodes (nodes with possible child nodes), not AstTerminals (leaf nodes containing text)
    ) { node: AstNode ->
        // use the same TreeMapper to convert all children of this node
        recursiveFlatten(node).flatMap { result ->
            // continue the tree mapping,
            astContinue(
                // replacing the node `node` with this KlassAnnotation
                KlassAnnotation(
                    // `result` is the summary of all child nodes;
                    // pick out all KlassIdentifiers
                    identifier = result.filterIsInstance<KlassIdentifier>(),
                    // ...and all KlassDeclarations
                    arguments = result.filterIsInstance<KlassDeclaration>()
                )
            )
        }
    }
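For readers who want the general idea without the library internals, here is a minimal sketch of a "flatMap on a tree" rewrite. It is not how TreeMapper is implemented; `Tree`, `Branch`, `Leaf` and `rewrite` are hypothetical names used only for illustration. Children are converted first, then every branch with a matching description is replaced by zero or more nodes.

```kotlin
// Hypothetical stand-in types; kotlinx.ast has its own node and terminal classes.
sealed class Tree { abstract val description: String }
data class Branch(override val description: String, val children: List<Tree>) : Tree()
data class Leaf(override val description: String, val text: String) : Tree()

// Bottom-up rewrite: children are converted first, then each branch whose
// description matches is replaced by whatever `replace` returns for it.
fun rewrite(node: Tree, description: String, replace: (Branch) -> List<Tree>): List<Tree> =
    when (node) {
        is Leaf -> listOf(node)
        is Branch -> {
            val rebuilt = node.copy(children = node.children.flatMap { rewrite(it, description, replace) })
            if (rebuilt.description == description) replace(rebuilt) else listOf(rebuilt)
        }
    }

fun main() {
    val annotation = Branch(
        "annotation",
        listOf(
            Leaf("AT_PRE_WS", "@"),
            Branch("simpleIdentifier", listOf(Leaf("Identifier", "KafkaListener"))),
            Branch(
                "valueArgument",
                listOf(Branch("simpleIdentifier", listOf(Leaf("Identifier", "topics"))))
            )
        )
    )
    // Replace every simpleIdentifier branch with a single summary leaf.
    val summary = rewrite(annotation, "simpleIdentifier") { branch ->
        listOf(Leaf("name", branch.children.filterIsInstance<Leaf>().joinToString("") { it.text }))
    }
    println(summary)
}
```

Returning a list from `replace` is what makes this flatMap-like: a matched node can be dropped, kept, or expanded into several nodes.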

I hope this helps; feel free to ask me any questions. Sadly, I have no time to add functionality for working with the AST nodes in a better way. In my private project (for which I'm developing this library), I'm doing code generation, using https://github.com/square/kotlinpoet for writing the generated code and kotlinx.ast for parsing the source. The first step after kotlinx.ast is filtering the annotations I'm interested in (and the annotated subjects) and storing this information in a set of data classes. This is very similar to the code you provided here. In the last step, I convert these data classes into generated code using kotlinpoet.
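The "data classes in the middle" step described above could look roughly like the sketch below. Everything in it is an assumption made for illustration: `ListenerInfo`, `renderTopicRegistry` and the generated `TopicRegistry` object are hypothetical, and the rendering uses plain string building instead of kotlinpoet to keep the example self-contained.

```kotlin
// Hypothetical intermediate model: one entry per annotated function.
data class ListenerInfo(val className: String, val functionName: String, val topics: List<String>)

// Render a Kotlin object that lists the topics of each listener function.
// Topic names are emitted as-is, so this sketch assumes they contain no
// characters that need escaping inside a Kotlin string literal.
fun renderTopicRegistry(listeners: List<ListenerInfo>): String = buildString {
    appendLine("object TopicRegistry {")
    for (listener in listeners) {
        val topics = listener.topics.joinToString(", ") { "\"$it\"" }
        appendLine("    val ${listener.className}_${listener.functionName} = listOf($topics)")
    }
    appendLine("}")
}

fun main() {
    val listeners = listOf(
        ListenerInfo("MyClass", "topicTest4MultipleMixedTopics", listOf("direct.topic.name.2"))
    )
    print(renderTopicRegistry(listeners))
}
```

Keeping the parsed information in plain data classes means the parsing step and the generation step can be tested separately.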

@drieks drieks self-assigned this May 11, 2021
@drieks drieks added feedback waiting for feedback question Further information is requested labels May 11, 2021