
Text fragment rendering displays specific portions of document text (defined by character ranges) with all inline embeds resolved to their actual content.

Overview

When embedding a document fragment with a range (e.g., hm://account/doc#blockId[10:50]), the system needs to:

  1. Extract code points 10 through 49 (start inclusive, end exclusive) from the block's text

  2. Resolve any inline embeds within that range

  3. Display the resulting text with embed names
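As a rough illustration of the first step, the sketch below splits the fragment part of a link such as hm://account/doc#blockId[10:50] into a block reference and a numeric range. The ParsedFragment type and parseFragment function are hypothetical names for this example, not the project's actual parser.

```typescript
// Hypothetical shape for a parsed fragment reference; not the project's real type.
interface ParsedFragment {
  blockRef: string
  start: number
  end: number
}

// Parses the part after '#', e.g. "blockId[10:50]".
// Returns null when there is no well-formed [start:end] range.
function parseFragment(fragment: string): ParsedFragment | null {
  const match = /^([^\[\]]+)\[(\d+):(\d+)\]$/.exec(fragment)
  if (!match) return null
  const start = Number(match[2])
  const end = Number(match[3])
  if (end < start) return null // reject inverted ranges
  return {blockRef: match[1], start, end}
}

const url = 'hm://account/doc#blockId[10:50]'
const parsed = parseFragment(url.slice(url.indexOf('#') + 1))
// parsed is {blockRef: 'blockId', start: 10, end: 50}
```

Rejecting inverted ranges up front is a design choice of this sketch; the real system may handle them differently.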

The Challenge

Fragment ranges are specified as Unicode code point positions in the original text. These positions:

  • Count actual text characters

  • Exclude invisible inline embed markers (U+FEFF)

  • Are measured in Unicode code points (not UTF-16 code units)

But we want to display text where:

  • Inline embeds are replaced with document names

  • Character positions map correctly to the original range
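To make the marker substitution concrete, here is a minimal sketch that replaces each U+FEFF marker with a document name and then slices the resolved text by code point. The resolveMarkers helper and its names argument are assumptions for illustration; in the component described below, resolution happens inside getDocumentText.

```typescript
const EMBED_MARKER = '\uFEFF'

// Replaces each marker with the next entry of `names` (hypothetical input).
// Markers without a matching name are dropped.
function resolveMarkers(text: string, names: string[]): string {
  let next = 0
  return Array.from(text)
    .map((ch) => (ch === EMBED_MARKER ? names[next++] ?? '' : ch))
    .join('')
}

const original = 'Check out \uFEFF post about AI!'
const resolved = resolveMarkers(original, ["@Alice's Guide"])
// resolved === "Check out @Alice's Guide post about AI!"

// Positions are then counted in code points of the resolved text.
const embedName = Array.from(resolved).slice(10, 24).join('')
// embedName === "@Alice's Guide"
```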

Solution: FragmentText Component

Component Location

frontend/packages/ui/src/document-content.tsx

How It Works

function FragmentText({
  documentId,
  blockRef,
  start,
  end,
}: {
  documentId: UnpackedHypermediaId
  blockRef: string
  start: number
  end: number
})

Process:

  1. Fetch Full Text: Call getDocumentText with the blockRef and resolveInlineEmbeds: true

    getDocumentText(
      {...documentId, blockRef, blockRange: null},
      {lineBreaks: false, resolveInlineEmbeds: true}
    )
    
  2. Extract Fragment: Use Array.from() to properly handle unicode code points

    const codePoints = Array.from(fullText)
    const fragment = codePoints.slice(start, end).join('')
    
  3. Display: Render the extracted text

    <Text className="whitespace-pre-wrap">{fragment}</Text>
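The extraction step can be isolated as a pure function, which makes it easy to test without React. The sketch below adds bounds clamping, which is an assumption of this example: a bare slice() treats negative indices as offsets from the end, which is unlikely to be what a fragment range means.

```typescript
// Returns the code points in [start, end) of fullText.
// Out-of-range bounds are clamped; an inverted range yields ''.
function extractFragment(fullText: string, start: number, end: number): string {
  const codePoints = Array.from(fullText)
  const from = Math.max(0, Math.min(start, codePoints.length))
  const to = Math.max(from, Math.min(end, codePoints.length))
  return codePoints.slice(from, to).join('')
}

const greeting = extractFragment(
  'Hello world, this is a test paragraph with some content.',
  0,
  11,
)
// greeting === 'Hello world'
```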
    

Integration with ContentEmbed

The ContentEmbed component detects text fragments and renders them appropriately:

// Check if this is a text fragment (blockRef with start/end range)
const isTextFragment =
  props.blockRef &&
  props.blockRange &&
  'start' in props.blockRange &&
  'end' in props.blockRange

if (isTextFragment && props.blockRef && props.blockRange &&
    'start' in props.blockRange && 'end' in props.blockRange) {
  // Render as plain text with resolved embeds
  content = (
    <FragmentText
      documentId={narrowHmId(props)}
      blockRef={props.blockRef}
      start={props.blockRange.start}
      end={props.blockRange.end}
    />
  )
} else {
  // Normal block rendering
  // ...
}
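The condition above repeats its checks, presumably so that TypeScript narrows props.blockRef and props.blockRange inside the branch. One possible alternative is a user-defined type guard, sketched here with hypothetical types (TextRange, ExpandedRange, EmbedProps); the project's real definitions may differ.

```typescript
// Hypothetical types for illustration only.
type TextRange = {start: number; end: number}
type ExpandedRange = {expanded: boolean}
type BlockRange = TextRange | ExpandedRange

interface EmbedProps {
  blockRef?: string | null
  blockRange?: BlockRange | null
}

type TextFragmentProps = EmbedProps & {blockRef: string; blockRange: TextRange}

// Type guard: a true result narrows props to TextFragmentProps.
function isTextFragment(props: EmbedProps): props is TextFragmentProps {
  return (
    !!props.blockRef &&
    !!props.blockRange &&
    'start' in props.blockRange &&
    'end' in props.blockRange
  )
}

function describeEmbed(props: EmbedProps): string {
  if (isTextFragment(props)) {
    // Narrowed: blockRef is a string and blockRange has start and end.
    return `${props.blockRef}[${props.blockRange.start}:${props.blockRange.end}]`
  }
  return 'block'
}
```

With a guard like this, the rendering branch needs only a single isTextFragment(props) check.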

Example Scenarios

Example 1: Simple Text Fragment

Original block text:

"Hello world, this is a test paragraph with some content."

Fragment: #blockId[0:11]

Result: "Hello world"

Example 2: Fragment with Inline Embed

Original block text (with invisible markers):

"Check out \uFEFF post about AI!"
// Position: 0-9, [10 = embed], 11-25

With inline embed resolved:

"Check out @Alice's Guide post about AI!"

Fragment: #blockId[0:20]

Result: "Check out @Alice's G" (first 20 Unicode code points of the resolved text)

Example 3: Multiple Inline Embeds

Original text:

"Read \uFEFF and \uFEFF for more info"
// [Read ][embed1][ and ][embed2][ for more info]

With embeds resolved:

"Read @Getting Started and @Advanced Topics for more info"

Fragment: #blockId[0:25]

Result: "Read @Getting Started and" (first 25 code points of the resolved text; the second embed name begins at position 26 and falls outside the range)

Character Position Mapping

Key Concepts

  1. Original Positions: Defined in the blockRange, count actual text excluding embed markers

  2. Resolved Text: After documentToText processes it, embeds become their document names

  3. Unicode Code Points: Use Array.from() so that characters outside the Basic Multilingual Plane (such as emoji) count as one position each

Why Array.from()?

JavaScript strings are UTF-16 encoded. Characters outside the Basic Multilingual Plane, including most emoji, occupy two UTF-16 code units, so String.length overcounts them:

// Wrong: UTF-16 code units
"Hello 👋".length // 8 (emoji uses 2 code units)

// Correct: Unicode code points
Array.from("Hello 👋").length // 7 (emoji is 1 code point)
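One caveat worth keeping in mind: Array.from() counts code points, not user-perceived characters. The short sketch below shows two cases where a single visible character spans several code points, so a fragment boundary can still land inside it. The strings are examples chosen for this illustration.

```typescript
const flag = '\u{1F1EF}\u{1F1F5}' // flag of Japan: two regional-indicator code points
const accented = 'e\u0301' // 'e' followed by a combining acute accent

const flagUnits = flag.length // 4 UTF-16 code units
const flagPoints = Array.from(flag).length // 2 code points
const accentedPoints = Array.from(accented).length // 2 code points

// Slicing by code point can therefore split a visible character:
const half = Array.from(accented).slice(0, 1).join('') // 'e' without its accent
```

Since fragment ranges are defined in code points, this behavior matches the range format; it only matters when choosing boundaries.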

API Endpoint Support

Fragment rendering is supported on both desktop and web, with a different fetching path on each:

Desktop

  • Direct grpcClient access

  • Synchronous document fetching

  • Immediate text resolution

Web

  • Server-side API: /hm/api/document-text

  • Accepts blockRef in URL parameters

  • Returns resolved text via JSON
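For the web path, a request URL might be assembled along these lines. Only the /hm/api/document-text route and the blockRef parameter come from the description above; the uid and path parameter names, and the buildDocumentTextUrl helper itself, are assumptions made for this sketch.

```typescript
// `uid` and `path` are hypothetical parameter names; only `blockRef` is
// stated in the endpoint description.
function buildDocumentTextUrl(uid: string, path: string[], blockRef: string): string {
  const params: Array<[string, string]> = [
    ['uid', uid],
    ['path', path.join('/')],
    ['blockRef', blockRef],
  ]
  const query = params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join('&')
  return `/hm/api/document-text?${query}`
}

const requestUrl = buildDocumentTextUrl('abc', ['docs', 'intro'], 'blk1')
// requestUrl === '/hm/api/document-text?uid=abc&path=docs%2Fintro&blockRef=blk1'
```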

Component States

Loading

if (loading) {
  return (
    <div className="flex items-center justify-center p-2">
      <Spinner />
    </div>
  )
}

Error

if (error) {
  return <ErrorBlock message={`Failed to load fragment: ${error}`} />
}

Success

return (
  <Text className="whitespace-pre-wrap">
    {text}
  </Text>
)
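The three states can also be modeled as a single discriminated union, so that combinations such as "loading with an error" cannot be represented. This is a sketch of that idea, not the component's actual state handling, which may use separate loading, error, and text values; describeState returns plain strings in place of JSX to keep the example self-contained.

```typescript
type FragmentState =
  | {status: 'loading'}
  | {status: 'error'; error: string}
  | {status: 'success'; text: string}

// Returns a plain-text stand-in for what each state would render.
function describeState(state: FragmentState): string {
  switch (state.status) {
    case 'loading':
      return 'spinner'
    case 'error':
      return `Failed to load fragment: ${state.error}`
    case 'success':
      return state.text
  }
}
```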

Usage in Embeds

When a user creates an embed with a text range:

  1. Editor: User selects text range in a block

  2. Link Creation: System creates link like hm://account/doc#blockId[10:50]

  3. Rendering:

    • ContentEmbed detects the range

    • FragmentText fetches and extracts text

    • Displays resolved fragment

Performance Considerations

Caching

  • Consider caching getDocumentText results

  • Fragment extraction is fast (O(n) where n = text length)

  • Network requests may be slow on web
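A simple way to cache results is to memoize the in-flight promise per key, so that several fragments of the same block share one request. The withCache wrapper and the string key are hypothetical; the real getDocumentText takes an ID object and options, and may already be cached elsewhere.

```typescript
type Fetcher = (key: string) => Promise<string>

// Wraps a fetcher so repeated requests for the same key share one promise.
function withCache(fetcher: Fetcher): Fetcher {
  const cache = new Map<string, Promise<string>>()
  return (key) => {
    let pending = cache.get(key)
    if (!pending) {
      pending = fetcher(key).catch((err) => {
        cache.delete(key) // do not keep failed requests in the cache
        throw err
      })
      cache.set(key, pending)
    }
    return pending
  }
}
```

This sketch never evicts successful entries; a real cache would likely need invalidation when a document version changes.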

Optimization

  • Only fetch when fragment changes

  • useEffect dependencies include every component of the document ID, plus blockRef, start, and end

  • Loading state prevents UI jank

Dependencies

useEffect(() => {
  // ...
}, [
  getDocumentText,
  documentId.uid,
  documentId.path?.join('/'),
  documentId.version,
  blockRef,
  start,
  end
])
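The path is joined into a string presumably because React compares each dependency with Object.is, so a new array holding the same segments would count as a change on every render. The comparison below illustrates the difference with example values.

```typescript
const pathA = ['docs', 'intro']
const pathB = ['docs', 'intro']

// Different array instances are never Object.is-equal, even with equal contents.
const sameArray = Object.is(pathA, pathB) // false

// Joined strings compare by value, so the effect does not re-run needlessly.
const sameJoined = Object.is(pathA.join('/'), pathB.join('/')) // true
```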

Testing

Test scenarios to verify:

  1. Basic fragment extraction: Simple text without embeds

  2. Single inline embed: Fragment includes embed name

  3. Multiple inline embeds: All embeds resolved correctly

  4. Unicode handling: Emojis and special characters

  5. Boundary cases: start=0, end=number of code points in the text (Array.from(text).length, not text.length)

  6. Error handling: Missing blocks, network errors
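Scenarios 1, 4, and 5 can be covered with a small table of cases against the slicing logic alone, as sketched below. The case list and helper names are illustrative; the embed and error scenarios would additionally need a stubbed getDocumentText, which is not shown here.

```typescript
// Same slicing as step 2 of the process, repeated so the sketch is self-contained.
const sliceCodePoints = (text: string, start: number, end: number): string =>
  Array.from(text).slice(start, end).join('')

interface FragmentCase {
  name: string
  text: string
  start: number
  end: number
  expected: string
}

const cases: FragmentCase[] = [
  {name: 'basic', text: 'Hello world, this is a test', start: 0, end: 11, expected: 'Hello world'},
  {name: 'emoji', text: 'Hi 👋 there', start: 3, end: 4, expected: '👋'},
  {name: 'whole text', text: 'abc', start: 0, end: 3, expected: 'abc'},
  {name: 'empty range', text: 'abc', start: 2, end: 2, expected: ''},
]

// Returns the names of failing cases; an empty array means all passed.
function runCases(): string[] {
  return cases
    .filter((c) => sliceCodePoints(c.text, c.start, c.end) !== c.expected)
    .map((c) => c.name)
}
```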

Related Files

  • frontend/packages/ui/src/document-content.tsx - FragmentText component

  • frontend/packages/shared/src/document-to-text.ts - Text resolution

  • frontend/packages/shared/src/document-content-types.ts - Type definitions

  • frontend/apps/web/app/routes/hm.api.document-text.tsx - Web API
