Habachen: Unlocking the Power of Japanese Character Conversion
Are you tired of manually converting Japanese text between full-width and half-width characters? Have you struggled with cumbersome methods for converting hiragana to katakana and vice versa? Look no further! Meet Habachen, a high-speed, memory-efficient text conversion module that simplifies your Japanese text processing tasks.
Installation
Before we dive into the amazing capabilities of Habachen, let’s quickly install it:
#bash
pip install habachen
Usage
Habachen provides several convenient functions for character conversion. Let’s explore a few of them:
Converting Half-width to Full-width
#python
import habachen
text = 'abc!?012ﾊﾝｶｸﾓｼﾞ'  # half-width ASCII, digits, and katakana
converted_text = habachen.han_to_zen(text)
print(converted_text)
# Output: 'ａｂｃ！？０１２ハンカクモジ'
Converting Half-width Katakana Only
#python
import habachen
text = 'abc!?012ﾊﾝｶｸﾓｼﾞ'  # only the half-width katakana will be widened
converted_text = habachen.han_to_zen(text, ascii=False, digit=False, kana=True)
print(converted_text)
# Output: 'abc!?012ハンカクモジ'
Converting Full-width to Half-width
#python
import habachen
text = 'ａｂｃ！？０１２ゼンカクモジ'  # full-width ASCII, digits, and katakana
converted_text = habachen.zen_to_han(text)
print(converted_text)
# Output: 'abc!?012ｾﾞﾝｶｸﾓｼﾞ'
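To make the ASCII part of width conversion concrete: in Unicode, the full-width forms U+FF01 to U+FF5E sit at a fixed offset of 0xFEE0 from the ASCII characters U+0021 to U+007E, and the ideographic space U+3000 pairs with the ordinary space. The following is a minimal pure-Python sketch of that mapping using `str.translate`. The helper names are hypothetical, kana is deliberately left out, and this is an illustration of the idea rather than a description of how Habachen is implemented.

```python
# Illustrative sketch only: ASCII width conversion via Unicode code point offsets.
# These helpers are hypothetical and are not part of Habachen.

OFFSET = 0xFEE0  # distance between '!' (U+0021) and its full-width form (U+FF01)

# Build the translation tables once: half-width -> full-width, and the reverse.
HAN_TO_ZEN_ASCII = {cp: cp + OFFSET for cp in range(0x21, 0x7F)}
HAN_TO_ZEN_ASCII[0x20] = 0x3000  # ordinary space -> ideographic space
ZEN_TO_HAN_ASCII = {zen: han for han, zen in HAN_TO_ZEN_ASCII.items()}


def ascii_to_fullwidth(text: str) -> str:
    """Widen printable ASCII characters; everything else passes through."""
    return text.translate(HAN_TO_ZEN_ASCII)


def ascii_to_halfwidth(text: str) -> str:
    """Narrow full-width ASCII forms; everything else passes through."""
    return text.translate(ZEN_TO_HAN_ASCII)


print(ascii_to_fullwidth('abc!?012'))
print(ascii_to_halfwidth('ａｂｃ！？０１２'))
```

Half-width katakana cannot be handled by a constant offset like this, because voiced sounds such as ｼﾞ are two code points in half-width form and a single code point in full-width form.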
Converting Full-width Katakana to Hiragana
#python
import habachen
text = 'モジレツノ変換'
converted_text = habachen.to_hiragana(text)
print(converted_text)
# Output: 'もじれつの変換'
Converting Hiragana to Katakana
#python
import habachen
text = 'もじれつの変換'
converted_text = habachen.to_katakana(text)
print(converted_text)
# Output: 'モジレツノ変換'
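For readers curious about what a hiragana/katakana conversion involves at the character level: the two Unicode blocks are laid out in parallel, so hiragana U+3041 to U+3096 and katakana U+30A1 to U+30F6 differ by a constant 0x60. Below is a minimal pure-Python sketch based on that fact. The function names are hypothetical, and nothing here is taken from Habachen's source; it only shows why kanji such as 変換 pass through unchanged.

```python
# Illustrative sketch only: kana conversion via the fixed 0x60 block offset.
# These helpers are hypothetical and are not part of Habachen.

KANA_OFFSET = 0x60  # katakana code point minus the matching hiragana code point

# Hiragana U+3041..U+3096 map onto katakana U+30A1..U+30F6.
HIRA_TO_KATA = {cp: cp + KANA_OFFSET for cp in range(0x3041, 0x3097)}
KATA_TO_HIRA = {kata: hira for hira, kata in HIRA_TO_KATA.items()}


def hira_to_kata_sketch(text: str) -> str:
    """Replace hiragana with katakana; other characters are left as they are."""
    return text.translate(HIRA_TO_KATA)


def kata_to_hira_sketch(text: str) -> str:
    """Replace full-width katakana with hiragana; other characters are left as they are."""
    return text.translate(KATA_TO_HIRA)


print(hira_to_kata_sketch('もじれつの変換'))  # モジレツノ変換
print(kata_to_hira_sketch('モジレツノ変換'))  # もじれつの変換
```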
Benchmarks
To demonstrate the remarkable performance of Habachen, we conducted benchmarks using different conversion tasks. Here are the results:
Short Text (140 characters)
| Conversion Task | Habachen | mojimoji | jaconv |
|---|---|---|---|
| Full-width to Half-width | 1.319 µs | 11.92 µs | 11.22 µs |
| Half-width to Full-width | 1.147 µs | 10.15 µs | 26.49 µs |
| Hiragana to Katakana | 0.3674 µs | | 11.22 µs |
| Katakana to Hiragana | 0.3542 µs | | 10.97 µs |
Long Text (468,996 characters)
| Conversion Task | Habachen | mojimoji | jaconv |
|---|---|---|---|
| Full-width to Half-width | 2.607 ms | 55.07 ms | 40.36 ms |
| Half-width to Full-width | 1.832 ms | 33.89 ms | 57.16 ms |
| Hiragana to Katakana | 0.711 ms | | 38.72 ms |
| Katakana to Hiragana | 0.755 ms | | 40.36 ms |
Impressive, right? Where figures are given, Habachen is about 8.5 to 31 times faster than mojimoji and jaconv on the short text, and about 15 to 55 times faster on the long text.
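The article does not say how the timings in the tables above were collected. If you want to run a comparable measurement on your own machine, one possible approach is Python's standard `timeit` module, as sketched below. The helper `time_per_call` and the stand-in converter `baseline_to_katakana` are hypothetical names introduced for this example; to time Habachen itself you would pass one of its functions (for instance `habachen.to_katakana`, as used earlier in this post) in place of the stand-in.

```python
# Illustrative sketch only: one way to time a text conversion function.
# The helper names are hypothetical; the original benchmark method is not documented here.
import timeit


def time_per_call(func, text, number=1000, repeat=5):
    """Return the best observed time per call, in seconds."""
    timer = timeit.Timer(lambda: func(text))
    # Taking the minimum over several repeats reduces noise from other processes.
    return min(timer.repeat(repeat=repeat, number=number)) / number


# Stand-in converter so the sketch runs without third-party packages.
_HIRA_TO_KATA = {cp: cp + 0x60 for cp in range(0x3041, 0x3097)}


def baseline_to_katakana(text):
    return text.translate(_HIRA_TO_KATA)


short_text = 'もじれつの変換' * 20  # 140 characters, the same size as the short-text case
seconds = time_per_call(baseline_to_katakana, short_text)
print(f'{seconds * 1e6:.3f} µs per call')
```

Absolute numbers depend on hardware and Python version, so results from a run like this are only meaningful when every library is measured on the same machine with the same input.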
Conclusion
Habachen is a game-changer for anyone working with Japanese text processing. Its high-speed and memory-efficient character conversion capabilities make it a must-have tool. Whether you need to convert full-width characters to half-width, hiragana to katakana, or vice versa, Habachen is your go-to module.
Now go ahead and unleash the power of Habachen in your Japanese text processing tasks!
For more details, documentation, and an in-depth performance analysis, visit the Habachen GitHub Repository and the Habachen Documentation.
Also, check out the fascinating Qiita article (in Japanese) by the creator of Habachen.
Happy text processing with Habachen!