Before the curious reader asks about the strange abbreviations in the title, I'd ask you to take one more look at the contents of some doc file:
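I suspect that in the days of our computer-literacy youth many of us tried to open a doc file in Notepad and saw similar gibberish. We asked ourselves what could be pulled out of this jumble of bytes, and the answer was nothing but resignation. The most interesting part here is the first 8 bytes, which are the same from file to file: in hex they are "D0 CF 11 E0 A1 B1 1A E1", and Notepad with a Windows-1251 code page shows them as something like "РП.аЎ±.б".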
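Those 8 bytes make a convenient sanity check before any parsing starts. A minimal sketch; `looksLikeCfb` and `CFB_SIGNATURE` are names invented for this illustration, not part of the article's code:

```php
<?php
// The 8-byte signature quoted above, written as a binary string.
const CFB_SIGNATURE = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1";

// Hypothetical helper: true when $data starts with the CFB signature.
function looksLikeCfb(string $data): bool
{
    // strncmp is binary-safe, so the control bytes in the signature are fine.
    return strncmp($data, CFB_SIGNATURE, 8) === 0;
}
```

In practice only the first 8 bytes of the file are needed, for example `looksLikeCfb(file_get_contents($path, false, null, 0, 8))` with `$path` pointing at the document.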
It's worth deciphering the second abbreviation in the title. WCBFF is nothing other than the Windows Compound Binary File Format, which in Russian comes out as something like "files of the Windows compound binary format". Leaving that translation on the corporation's conscience, let's consider how this horribly named format can help us.
CFB is the ancestor, or more precisely the skeleton, of all Microsoft Office formats from version 97 up to 2007 (when saving in compatibility mode). CFB is used to store not only Word texts but also Excel sheets and PowerPoint presentations. As a result, to get at the text we first have to read the CFB container, which is not encrypted but is thoroughly tangled, and then find the text in the data we have read, taking the DOC format into account.
CFB, or a small file system
As I said, the first step is reading the CFB. CFB is a file system in miniature, with sectors, a root directory, and files of a sort. It has the same problems as an ordinary file system, sector fragmentation for example. Fortunately, Microsoft published the documentation for both CFB and all the formats built on top of it several years ago.
Let's try to work out how information is packed into a CFB file. The whole file is divided into sectors of 512 bytes each (in the newer version 4 the sector size is 4096 bytes). The first sector is the file header; that is what we saw in the screenshot above. The header contains all the information about how, what, and in what order to read from the file.
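To make the sector arithmetic concrete, here is a minimal sketch. It assumes the header layout from [MS-CFB], where the 16-bit sector shift lives at offset 0x1E (9 for 512-byte sectors, 12 for 4096-byte ones); the function names are hypothetical:

```php
<?php
// Sector size from the header's sector shift (offset 0x1E, little-endian 16-bit).
function cfbSectorSize(string $header): int
{
    $shift = unpack('v', $header, 0x1E)[1];
    return 1 << $shift;
}

// Byte offset of sector $n in the file. The header takes up the space of
// one sector, so sector 0 starts one sector size into the file.
function cfbSectorOffset(int $n, int $sectorSize): int
{
    return ($n + 1) * $sectorSize;
}
```

With 512-byte sectors, sector 0 therefore starts at byte 512 and sector 2 at byte 1536.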
Data in the file is stored in regular (FAT) sectors of the same 512 bytes each. If the data does not fit in one sector, the rest continues in the next sector of the chain. The sectors of a chain may be scattered all over the file (that is, the file can be fragmented, as mentioned above). To keep the sector chains intact there are special sectors that record, for each sector, which sector to switch to next if not all the data has been read yet. The end of a chain is marked by the special value ENDOFCHAIN = 0xFFFFFFFE.
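Following a chain then amounts to a table lookup per step. The sketch below assumes the FAT has already been loaded into a PHP array where `$fat[$n]` is the sector that follows sector `$n`, and that PHP is a 64-bit build so that 0xFFFFFFFE is an integer; `cfbChain` is a name made up for this example:

```php
<?php
const ENDOFCHAIN = 0xFFFFFFFE;

// Return the ordered list of sector numbers that make up one chain.
function cfbChain(array $fat, int $start): array
{
    $chain = [];
    $seen = [];
    for ($s = $start; $s !== ENDOFCHAIN; $s = $fat[$s]) {
        // A repeated or unknown sector means the file is damaged; stop
        // instead of looping forever.
        if (isset($seen[$s]) || !isset($fat[$s])) {
            throw new RuntimeException("Broken sector chain at sector $s");
        }
        $seen[$s] = true;
        $chain[] = $s;
    }
    return $chain;
}
```

The array itself can be built from the raw bytes of the FAT sectors with `array_values(unpack('V*', $fatBytes))`.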
Since 512 bytes can be far too much for some data, there are also "miniature" sectors, called mini FAT. A mini FAT sector is 64 bytes long, so 8 (or 64) of these small segments fit into one regular sector. The choice between FAT and mini FAT depends on the total length of the data in question. If it is less than 4096 bytes (the threshold is one of the parameters in the file header), mini FAT is used, otherwise FAT.
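The rule is small enough to write down directly. A sketch with hypothetical function names; the default cutoff of 4096 and the mini sector size of 64 are the values mentioned above, and a real parser would read them from the header rather than hard-code them:

```php
<?php
// True when a stream of $streamSize bytes lives in mini FAT sectors.
function cfbUsesMiniFat(int $streamSize, int $cutoff = 4096): bool
{
    return $streamSize < $cutoff;
}

// How many mini sectors fit into one regular sector.
function cfbMiniSectorsPerSector(int $sectorSize, int $miniSectorSize = 64): int
{
    return intdiv($sectorSize, $miniSectorSize);
}
```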
The data in a CFB file is not dumped in a heap: it is organized as a tree whose root is a special "file entry", the Root Entry. Each entry is 128 bytes long (4 or 32 entries fit into one FAT sector) and is described by its name, its type (storage, stream, root storage, or unused), the element's "children" and "siblings", and its color in a red-black tree. Streams and the root element additionally carry the starting position of their content and its length.
So each entry in this file system is characterized by the content "attached" to it. For streams this is the data itself; for the root element it is the mini stream, the container that holds the mini FAT sectors.
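As an illustration, one 128-byte entry can be unpacked like this. The field offsets are taken from [MS-CFB] as I read it (name at 0x00, name length at 0x40, type at 0x42, starting sector at 0x74, size at 0x78); `cfbParseDirEntry` is a hypothetical name and the sketch assumes the iconv extension is available:

```php
<?php
// Decode the fields of one directory entry that matter for text extraction.
function cfbParseDirEntry(string $entry): array
{
    // Name length is in bytes and includes the two-byte UTF-16 terminator.
    $nameLen = unpack('v', $entry, 0x40)[1];
    $rawName = substr($entry, 0, max(0, $nameLen - 2));
    $f = unpack('Ctype/Ccolor/Vleft/Vright/Vchild', $entry, 0x42);
    // Only the low 32 bits of the size are read here.
    $s = unpack('Vstart/Vsize', $entry, 0x74);
    return [
        'name'  => iconv('UTF-16LE', 'UTF-8', $rawName),
        'type'  => $f['type'],   // 0 unused, 1 storage, 2 stream, 5 root storage
        'color' => $f['color'],  // 0 red, 1 black
        'left'  => $f['left'],   // sibling links of the red-black tree
        'right' => $f['right'],
        'child' => $f['child'],
        'start' => $s['start'],  // first sector of the content
        'size'  => $s['size'],
    ];
}
```

Finding a stream by name is then a walk over the left, right, and child links starting from the Root Entry.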
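In addition, the file has a structure called DIFAT, which stores references to the sectors that hold the FAT chains. The first 109 DIFAT references sit at the end of the file header and can "serve" a file of up to roughly 7 MB (109 FAT sectors, each describing 128 sectors of 512 bytes); if that is not enough, the header can point to additional DIFAT sectors.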
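A short sketch of reading those header slots, assuming the [MS-CFB] layout in which the 109 DIFAT entries start at offset 0x4C and unused slots hold 0xFFFFFFFF. The function names are invented for the example, a 64-bit PHP 7.4+ build is assumed, and the second helper estimates how much data the 109 slots can address:

```php
<?php
const FREESECT = 0xFFFFFFFF;

// FAT sector numbers listed in the header, with unused slots dropped.
function cfbHeaderDifat(string $header): array
{
    $slots = array_values(unpack('V109', $header, 0x4C));
    return array_values(array_filter($slots, fn(int $s): bool => $s !== FREESECT));
}

// Bytes addressable through the header DIFAT alone: each FAT sector holds
// one 4-byte entry per sector it describes.
function cfbHeaderDifatCapacity(int $sectorSize = 512): int
{
    return 109 * intdiv($sectorSize, 4) * $sectorSize;
}
```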
This information briefly sums up all the confusion and hell going on inside CFB files. The format is, in principle, well documented (the links are at the end of the post, as always), and reading the manual thoughtfully and patiently is enough. A full description of how CFB files work was never the goal of this article, so let's move on to the main thing: how, with all of this in place, to read the documents...
Doc, or they stole my offsets
To be honest, I only managed to write the doc parser together with cfb on the third attempt. Before that I simply could not work out what depends on what, and why everything had to be done exactly this way and not otherwise... While CFB causes no particular trouble (apart from English as the language of the documentation), with DOC trouble is guaranteed.
So, we have read the DOC's file system and are now looking for text data in it. We open the specification and... Microsoft gives us a splendid opportunity to have some fun. For this we need just two entries in the CFB file's tree of elements: the stream named "WordDocument" and, depending on circumstances, the stream named "0Table" or "1Table".
The first stream contains the text of the Word document, but alas, you cannot just take it. Everything is binary to a horrifying degree, with extra treats such as Unicode in little-endian byte order (as in the whole CFB file, it should be noted). So we first have to read some fields from the FIB (File Information Block), the header at the beginning of the WordDocument stream, which swells from version to version (in Word 97 this header was about 700 bytes; by 2007 it was already over 2000).
First we read the word at offset 0x000A. In it we look at bit 0x0200: if it is set, we work with the 1Table stream, otherwise with 0Table. It is worth noting that I have come across files containing both tables; either way, you have to read the one the flag points to.
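The flag test itself is a one-liner. A minimal sketch, with `docTableStreamName` as a hypothetical helper that takes the raw bytes of the WordDocument stream:

```php
<?php
// Pick the table stream name from the FIB flags word at offset 0x000A.
function docTableStreamName(string $wordDocument): string
{
    $flags = unpack('v', $wordDocument, 0x000A)[1];
    return ($flags & 0x0200) ? '1Table' : '0Table';
}
```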
Next we need to find the CLX, which is, oddly enough, the CompLeX structure that stores the offsets and lengths of the text runs in the WordDocument stream. The offset and the length of the CLX are in the double words at 0x01A2 and 0x01A6 of the document header, the FIB. With this information in hand, we read the CLX from the table stream and look for a flag in it...
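Cutting the CLX out of the table stream could look like the sketch below. It uses the two FIB offsets named above and assumes both streams are already in memory as strings; `docReadClx` is a made-up name:

```php
<?php
// Read fcClx (0x01A2) and lcbClx (0x01A6) from the FIB and slice the table stream.
function docReadClx(string $wordDocument, string $tableStream): string
{
    $fib = unpack('VfcClx/VlcbClx', $wordDocument, 0x01A2);
    return substr($tableStream, $fib['fcClx'], $fib['lcbClx']);
}
```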
The thing is, the CLX contains two completely different variable-size data structures: RgPrc, which we do not need, and the important PlcPcd. What is more, RgPrc may have zero length or any nonzero length. "Fortunately", the documentation does not spell out how to separate the first from the second, so the final code had to resort to a rather strange crutch.
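One possible alternative to a byte search, offered as an assumption based on my reading of [MS-DOC] rather than as the article's method: each Prc record appears to start with the byte 0x01 followed by a 16-bit length, and the piece table with the byte 0x02 followed by a 32-bit length, which would allow the records to be skipped one by one. `docFindPlcPcd` is a hypothetical name:

```php
<?php
// Walk the CLX record by record; return the PlcPcd bytes or null if malformed.
function docFindPlcPcd(string $clx): ?string
{
    $pos = 0;
    $len = strlen($clx);
    while ($pos < $len) {
        $marker = ord($clx[$pos]);
        if ($marker === 0x01 && $pos + 3 <= $len) {
            // Prc: skip the marker, the 16-bit size, and the payload.
            $cbGrpprl = unpack('v', $clx, $pos + 1)[1];
            $pos += 3 + $cbGrpprl;
        } elseif ($marker === 0x02 && $pos + 5 <= $len) {
            // Piece table: 32-bit length, then the PlcPcd itself.
            $lcb = unpack('V', $clx, $pos + 1)[1];
            $plc = substr($clx, $pos + 5, $lcb);
            return strlen($plc) === $lcb ? $plc : null;
        } else {
            return null;
        }
    }
    return null;
}
```

Unlike a plain search for the byte 0x02, this does not get confused when a Prc payload happens to contain that byte.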
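Having obtained the PlcPcd, or simply the piece table, we can split it into two arrays: cp (character positions), from which the length of each piece of text follows as lcb_i = cp_{i+1} - cp_i (counted in characters), and pcd (piece descriptors). Each descriptor holds an offset into the WordDocument stream and the fCompress property (fCompressed in [MS-DOC]), which says whether the piece is stored as 16-bit Unicode or compressed to 8-bit ANSI (Windows-1252).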
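The split and the decoding can be sketched as follows. This is an illustration under stated assumptions, not the article's listing: it derives the piece count from the table size (n pieces occupy 4(n+1) + 8n bytes), expects a well-formed table, and relies on the iconv extension; `docPiecesToUtf8` is a hypothetical name:

```php
<?php
// Turn a PlcPcd plus the WordDocument stream into UTF-8 text.
function docPiecesToUtf8(string $plcPcd, string $wordDocument): string
{
    $n = intdiv(strlen($plcPcd) - 4, 12);                // number of pieces
    $cp = array_values(unpack('V' . ($n + 1), $plcPcd)); // character positions
    $text = '';
    for ($i = 0; $i < $n; $i++) {
        // The fc field sits 2 bytes into each 8-byte piece descriptor.
        $fcValue = unpack('V', $plcPcd, 4 * ($n + 1) + 8 * $i + 2)[1];
        $compressed = ($fcValue & 0x40000000) !== 0;
        $fc = $fcValue & 0x3FFFFFFF;
        $chars = $cp[$i + 1] - $cp[$i];
        if ($compressed) {
            // 8-bit text: one byte per character, real offset is fc / 2.
            $raw = substr($wordDocument, intdiv($fc, 2), $chars);
            $text .= iconv('Windows-1252', 'UTF-8', $raw);
        } else {
            // 16-bit text: two bytes per character, little-endian.
            $raw = substr($wordDocument, $fc, $chars * 2);
            $text .= iconv('UTF-16LE', 'UTF-8', $raw);
        }
    }
    return $text;
}
```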
The resulting pieces may contain control characters, such as the placeholders for an inserted object or picture. My code removes some of them; I leave the parsing of the remaining special characters to the reader.
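For a first approximation, a blunt cleanup pass may be enough. The policy below is an assumption made for this example and not the author's: paragraph marks (0x0D) become newlines, table cell marks (0x07) become tabs, and every other C0 control character is dropped. `docStripControlChars` is a hypothetical name:

```php
<?php
// Crude cleanup of already decoded UTF-8 text.
function docStripControlChars(string $utf8): string
{
    $utf8 = strtr($utf8, ["\r" => "\n", "\x07" => "\t"]);
    // Bytes below 0x20 never occur inside multi-byte UTF-8 sequences,
    // so a byte-wise pattern is safe here. Tab and newline are kept.
    return preg_replace('/[\x00-\x08\x0B-\x1F]/', '', $utf8);
}
```

Note that this keeps the field instructions that Word stores between the field markers; a more careful cleaner would remove those as well.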
The code
Well, as usual, here is part of the code and a link to the sources:
The commented code is available on GitHub.
    class doc extends cfb {
        // Returns the document text, or false when a required stream is missing.
        public function parse() {
            parent::parse();

            $wdStreamID = $this->getStreamIdByName("WordDocument");
            if ($wdStreamID === false) { return false; }
            $wdStream = $this->getStreamById($wdStreamID);

            // FIB flags word: 0x0004 is fComplex, 0x0200 selects 1Table over 0Table.
            $bytes = $this->getShort(0x000A, $wdStream);
            $fComplex = ($bytes & 0x0004) == 0x0004;
            $fWhichTblStm = ($bytes & 0x0200) == 0x0200;

            // Offset and length of the CLX inside the table stream.
            $fcClx = $this->getLong(0x01A2, $wdStream);
            $lcbClx = $this->getLong(0x01A6, $wdStream);

            // Character counts of the document parts; together they give the last CP.
            $ccpText = $this->getLong(0x004C, $wdStream);
            $ccpFtn = $this->getLong(0x0050, $wdStream);
            $ccpHdd = $this->getLong(0x0054, $wdStream);
            $ccpMcr = $this->getLong(0x0058, $wdStream);
            $ccpAtn = $this->getLong(0x005C, $wdStream);
            $ccpEdn = $this->getLong(0x0060, $wdStream);
            $ccpTxbx = $this->getLong(0x0064, $wdStream);
            $ccpHdrTxbx = $this->getLong(0x0068, $wdStream);

            $lastCP = $ccpFtn + $ccpHdd + $ccpMcr + $ccpAtn + $ccpEdn + $ccpTxbx + $ccpHdrTxbx;
            $lastCP += ($lastCP != 0 ? 1 : 0) + $ccpText;

            $tStreamID = $this->getStreamIdByName(intval($fWhichTblStm) . "Table");
            if ($tStreamID === false) { return false; }
            $tStream = $this->getStreamById($tStreamID);

            $clx = substr($tStream, $fcClx, $lcbClx);

            // The crutch: look for a 0x02 byte whose following length matches
            // the number of bytes left in the CLX.
            $lcbPieceTable = 0;
            $pieceTable = "";
            $from = 0;
            while (($i = strpos($clx, chr(0x02), $from)) !== false) {
                $lcbPieceTable = $this->getLong($i + 1, $clx);
                $pieceTable = substr($clx, $i + 5);
                if (strlen($pieceTable) != $lcbPieceTable) {
                    $from = $i + 1;
                    continue;
                }
                break;
            }

            // Character positions run until the last CP; piece descriptors follow.
            $cp = array(); $i = 0;
            while (($cp[] = $this->getLong($i, $pieceTable)) != $lastCP)
                $i += 4;
            $pcd = str_split(substr($pieceTable, $i + 4), 8);

            $text = "";
            for ($i = 0; $i < count($pcd); $i++) {
                $fcValue = $this->getLong(2, $pcd[$i]);
                $isANSI = ($fcValue & 0x40000000) == 0x40000000;
                $fc = $fcValue & 0x3FFFFFFF;

                // Length in characters: doubled for Unicode, offset halved for ANSI.
                $lcb = $cp[$i + 1] - $cp[$i];
                if (!$isANSI)
                    $lcb *= 2;
                else
                    $fc = intdiv($fc, 2);

                $part = substr($wdStream, $fc, $lcb);
                if (!$isANSI)
                    $part = $this->unicode_to_utf8($part);

                // Append the piece (plain assignment would keep only the last one).
                $text .= $part;
            }

            return $text;
        }
    }
References
- [MS-CFB]: Compound File Binary File Format;
- Windows Compound Binary File Format Specification;
- Microsoft Office Word 97-2007 Binary File Format Specification;
- [MS-DOC]: Word Binary File Format (.doc) Structure Specification;
- How to get text from a binary .doc file.
And links to the other articles in the "Text at any cost" series: