GitHub - seinshah/flattenhtml: HTML document flattener package for Go

flattenhtml is a Go package that gives you direct access to specific nodes in an HTML document, without having to traverse every node yourself.

Installation

go get github.com/seinshah/flattenhtml

Overview

Use built-in or custom flatteners to access HTML document nodes directly with the selectors you need. For example, you may want all div nodes (selected by tag name), all elements that have a class attribute, or all elements whose class value is container.

flattenhtml currently supports the following flatteners out of the box:

TagFlattener: flattens all nodes based on their tag name.
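
To make the idea of flattening by tag name concrete, here is a minimal, self-contained sketch of the concept. It does not use flattenhtml and is not the package's implementation: the node type, flattenByTag, and sampleTree are hypothetical names introduced only for illustration. It assumes that flattening means walking the tree once and indexing every node under its tag name, so later lookups need no further traversal.

```go
package main

import "fmt"

// node is a hypothetical, minimal stand-in for an HTML element.
// It is NOT the flattenhtml.Node type.
type node struct {
    tag      string
    attrs    map[string]string
    children []*node
}

// flattenByTag walks the tree once (depth-first, in document order)
// and groups every node under its tag name.
func flattenByTag(root *node) map[string][]*node {
    index := make(map[string][]*node)

    var walk func(n *node)
    walk = func(n *node) {
        if n == nil {
            return
        }

        index[n.tag] = append(index[n.tag], n)

        for _, c := range n.children {
            walk(c)
        }
    }
    walk(root)

    return index
}

// sampleTree builds a small tree equivalent to:
// <body><div class="container" id="target"><div class="row"><p></p></div></div></body>
func sampleTree() *node {
    return &node{tag: "body", children: []*node{
        {tag: "div", attrs: map[string]string{"class": "container", "id": "target"}, children: []*node{
            {tag: "div", attrs: map[string]string{"class": "row"}, children: []*node{
                {tag: "p"},
            }},
        }},
    }}
}

func main() {
    index := flattenByTag(sampleTree())

    // Direct lookup: reaching the div nodes needs no second traversal.
    for _, d := range index["div"] {
        fmt.Println(d.attrs["class"])
    }

    // Output:
    // container
    // row
}
```

A flattener keyed on something else, such as an attribute name or value, could plausibly follow the same pattern with a different map key; that is a guess about the design, not a description of the package.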

You can build your own custom flattener by implementing the *flattenhtml.Flattener interface. If your implementation is generic enough to be useful to others, please consider contributing it to this package.

Usage

package main

import (
    "fmt"
    "log"
    "strings"

    "github.com/seinshah/flattenhtml"
)

func main() {
    // HTML document to be flattened.
    html := `
        <html>
            <head>
                <title>flattenhtml</title>
            </head>
            <body>
                <div class="container" id="target">
                    <div class="row">
                        <div class="col-md-6">
                            <h1>flattenhtml</h1>
                            <p>flattens HTML documents</p>
                        </div>
                        <div class="col-md-6">
                            <h1>flattenhtml</h1>
                            <p>flattens HTML documents</p>
                        </div>
                    </div>
                </div>
            </body>
        </html>
    `

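    // Read the HTML document into a node manager.
    nm, err := flattenhtml.NewNodeManagerFromReader(strings.NewReader(html))
    if err != nil {
        log.Fatal(err)
    }

    // Parse the document using the built-in tag flattener.
    mc, err := nm.Parse(flattenhtml.NewTagFlattener())
    if err != nil {
        log.Fatal(err)
    }

    // Select the tag flattener from the parse result.
    tf, err := mc.SelectFlattener(&flattenhtml.TagFlattener{})
    if err != nil {
        log.Fatal(err)
    }

    // Access all div nodes directly by tag name.
    divs := tf.SelectNodes("div")

    // Keep only the divs whose class is "container" and print their id.
    divs.
        Filter(flattenhtml.WithAttributeValueAs("class", "container")).
        Each(func(n *flattenhtml.Node) {
            val, _ := n.Attribute("id")

            fmt.Println(val)

            // Output:
            // target
        })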
}

