strsplit

package
v1.130.0 Latest
Warning

This package is not in the latest version of its module.

Published: Mar 29, 2026 License: MIT Imports: 3 Imported by: 0

Documentation

Overview

Package strsplit splits strings into chunks of bounded byte size without breaking UTF-8 encoded characters, preferring human-readable boundaries (spaces, punctuation, and line breaks) as split points.

Problem

Naive string slicing by byte index is unsafe for UTF-8 text and can split a multi-byte rune in the middle, producing invalid output. Even when output stays valid, hard cuts in the middle of words make messages difficult to read (for example in chat payload limits, SMS segmentation, logs, or fixed-size transport frames).

strsplit provides chunking helpers that are UTF-8 aware and separator-aware, so chunks stay valid and readable.
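The failure mode described above can be reproduced with the standard library alone. The sketch below is illustrative only: `safeCut` is a hypothetical helper, not part of strsplit, and it assumes the input is valid UTF-8. It shows a naive byte slice producing invalid output, and a cut that steps back to the nearest rune boundary.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// safeCut is a hypothetical helper (not part of strsplit). It returns the
// longest prefix of s that is at most size bytes long and ends on a rune
// boundary. It assumes s is valid UTF-8.
func safeCut(s string, size int) string {
	if size <= 0 {
		return ""
	}
	if len(s) <= size {
		return s
	}
	// Step back until the byte at the cut position starts a rune.
	for size > 0 && !utf8.RuneStart(s[size]) {
		size--
	}
	return s[:size]
}

func main() {
	s := "héllo" // 'é' occupies two bytes in UTF-8

	naive := s[:2] // cuts 'é' in half
	fmt.Println(utf8.ValidString(naive)) // false

	safe := safeCut(s, 2)
	fmt.Println(safe)                   // h
	fmt.Println(utf8.ValidString(safe)) // true
}
```

The safe cut may return fewer bytes than requested; that is the price of never emitting a partial rune.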

How It Works

The package exposes two functions:

  • Chunk: splits a full text block, prioritizing newline boundaries first, trimming whitespace per line, then delegating long lines to ChunkLine.
  • ChunkLine: splits a single line by maximum byte size, ensuring the split point is at a rune boundary and preferring the closest separator before the limit.
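The newline-first flow described for Chunk can be sketched roughly as follows. This is not the package's implementation: `chunkBlock` and `hardSplit` are hypothetical names, `hardSplit` stands in for ChunkLine using rune-safe hard cuts only, the chunk limit `n` is omitted for brevity, and skipping blank lines is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// hardSplit is a hypothetical stand-in for ChunkLine. It makes rune-safe hard
// cuts only and ignores separators. Assumes valid UTF-8 input.
func hardSplit(line string, size int) []string {
	var out []string
	for len(line) > size {
		cut := size
		// Step back to the start of the rune that straddles the limit.
		for cut > 0 && !utf8.RuneStart(line[cut]) {
			cut--
		}
		if cut <= 0 {
			break // size is smaller than the first rune; stop splitting
		}
		out = append(out, line[:cut])
		line = line[cut:]
	}
	if line != "" {
		out = append(out, line)
	}
	return out
}

// chunkBlock sketches the newline-first flow: split on newlines, trim each
// line, pass short lines through, and delegate long lines to the line splitter.
func chunkBlock(text string, size int) []string {
	var out []string
	for _, line := range strings.Split(text, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue // assumption: blank lines yield no chunk
		}
		if len(line) <= size {
			out = append(out, line)
			continue
		}
		out = append(out, hardSplit(line, size)...)
	}
	return out
}

func main() {
	parts := chunkBlock("  first line  \nhelloworldbellaciao", 10)
	fmt.Printf("%q\n", parts) // ["first line" "helloworld" "bellaciao"]
}
```

Splitting on newlines first means a chunk never spans two lines, which is what keeps paragraph structure intact.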

Separator preference order in ChunkLine:

  1. Unicode whitespace.
  2. Unicode punctuation (kept with the preceding chunk).
  3. Hard UTF-8-safe cut when no separator exists.
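One way to picture this preference order is a function that picks the end of the first chunk. The sketch below is an assumption about how such a search could look, not the package's code: `cutIndex` is a hypothetical name, it only considers separators that lie entirely within the first `size` bytes, and it uses `unicode.IsSpace` and `unicode.IsPunct` as the separator tests.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// cutIndex is a hypothetical illustration, not the strsplit implementation.
// It returns the byte index at which the first chunk of s should end, given
// a maximum chunk size in bytes.
func cutIndex(s string, size int) int {
	if len(s) <= size {
		return len(s)
	}
	lastSpace, lastPunct, hard := -1, -1, 0
	for i := 0; i < len(s); {
		r, w := utf8.DecodeRuneInString(s[i:])
		if i+w > size {
			break // this rune would cross the limit
		}
		switch {
		case unicode.IsSpace(r):
			lastSpace = i // cut before the whitespace
		case unicode.IsPunct(r):
			lastPunct = i + w // cut after the punctuation, keeping it
		}
		i += w
		hard = i // last rune boundary that still fits
	}
	switch {
	case lastSpace > 0:
		return lastSpace // 1. whitespace
	case lastPunct > 0:
		return lastPunct // 2. punctuation
	default:
		return hard // 3. hard rune-safe cut
	}
}

func main() {
	for _, s := range []string{"hello world", "hello,world", "helloworld"} {
		fmt.Printf("%q\n", s[:cutIndex(s, 8)])
	}
	// "hello"
	// "hello,"
	// "hellowor"
}
```

A full splitter would call such a function in a loop and trim whitespace from each chunk; that part is left out here.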

Both functions support an optional chunk limit `n`:

  • `n > 0`: return at most `n` chunks.
  • `n < 0`: unlimited chunks.
  • `n == 0`: return nil.
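The three cases mirror the `n` convention of `strings.SplitN` in the standard library, with one difference visible in the Chunk example output: text beyond the n-th chunk appears to be dropped rather than appended to the last chunk. A minimal sketch of that rule, using a hypothetical `capChunks` helper that is not part of strsplit:

```go
package main

import "fmt"

// capChunks is a hypothetical helper illustrating the documented meaning of n.
// A real splitter would more likely stop producing chunks once n is reached
// instead of truncating afterwards.
func capChunks(chunks []string, n int) []string {
	switch {
	case n == 0:
		return nil // n == 0: no chunks at all
	case n > 0 && len(chunks) > n:
		return chunks[:n] // n > 0: keep at most n chunks
	default:
		return chunks // n < 0, or fewer than n chunks available
	}
}

func main() {
	all := []string{"hello", "world", "bella", "ciao"}

	fmt.Println(capChunks(all, 3))        // [hello world bella]
	fmt.Println(capChunks(all, -1))       // [hello world bella ciao]
	fmt.Println(capChunks(all, 0) == nil) // true
}
```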

Key Features

  • UTF-8 safety: never cuts in the middle of a multi-byte rune.
  • Readability-aware splitting: prefers spaces and punctuation over arbitrary byte boundaries.
  • Newline-first semantics in Chunk: preserves natural paragraph structure.
  • Bounded output control via `n`, useful for APIs with strict item limits.
  • Deterministic trimming of leading/trailing whitespace in produced chunks.

Usage

chunks := strsplit.Chunk(text, 280, -1)    // split full text block
lineParts := strsplit.ChunkLine(line, 64, 3) // at most 3 chunks

The package suits Go applications that need Unicode-aware message segmentation under byte-size constraints.


Constants

This section is empty.

Variables

This section is empty.

Functions

func Chunk

func Chunk(s string, size, n int) []string

Chunk splits a text block into substrings of at most size bytes, breaking at newline and separator boundaries and trimming whitespace from each chunk. When n is positive, it returns at most n chunks.

Example
package main

import (
	"fmt"

	"github.com/tecnickcom/gogen/pkg/strsplit"
)

func main() {
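	// A 10-byte line and a 9-byte line, split into chunks of at most 5 bytes.
	str := "helloworld\nbellaciao"
	// n = 3 caps the result at three chunks, so the trailing "ciao" is dropped.
	d := strsplit.Chunk(str, 5, 3)

	fmt.Println(d)
}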
Output:
[hello world bella]

func ChunkLine

func ChunkLine(s string, size, n int) []string

ChunkLine splits a single line into substrings of at most size bytes, cutting only at UTF-8 rune boundaries and preferring whitespace or punctuation separators. When n is positive, it returns at most n chunks.

Example
package main

import (
	"fmt"

	"github.com/tecnickcom/gogen/pkg/strsplit"
)

func main() {
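	// There is no whitespace within the first 8 bytes, so the comma is the
	// split point and stays with the preceding chunk.
	str := "hello,world"
	d := strsplit.ChunkLine(str, 8, -1) // n < 0: no limit on the number of chunks

	fmt.Println(d)
}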
Output:
[hello, world]

Types

This section is empty.
