Saturday 26 July 2014

Unicode and Encodings

Here is a summary of all things Unicode:

 Unicode
  • Unicode maps integers (code points) in the range U+0000 to U+10FFFF to characters; a code point needs at most 21 bits, so it fits in a 32-bit (4 byte) integer
  • The first 128 code points (hex values 00 to 7F) are the same as ASCII
  • The next 128 code points (0x80-0xFF) are the same as ISO-8859-1
  • An encoding is a mapping from bytes to Unicode code points 
Character Reference and Code Tables
Planes
  • A plane is a continuous group of 65,536 (= 2^16) code points 
  • There are 17 planes, identified by the numbers 0 to 16 
  • The Basic Multilingual Plane (BMP) is plane 0 (0000–FFFF)
  • Planes 1–16 are called “supplementary planes”
  • The code points in each plane have the hexadecimal values xx0000 to xxFFFF, where xx is a hex value from 00 to 10, signifying the plane to which the values belong
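Since each plane holds 2^16 code points, the plane number is simply whatever sits above the low 16 bits of the code point. A minimal sketch of that arithmetic follows; the class and method names are made up for illustration.

```java
class UnicodePlanes {
    /** Returns the plane number (0 to 16) of a valid Unicode code point. */
    static int planeOf(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("Not a code point: " + codePoint);
        }
        // Each plane spans 0x10000 code points, so drop the low 16 bits.
        return codePoint >>> 16;
    }

    public static void main(String[] args) {
        int[] samples = {0x0061, 0x20AC, 0x1F600, 0x10FFFF};
        for (int cp : samples) {
            System.out.printf("U+%04X is in plane %d%n", cp, planeOf(cp));
        }
    }
}
```

Plane 0 is the BMP; any other result means the code point lives in a supplementary plane.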
UTF-8 Encoding
UTF-16
  • Encodes code-points as one or two 16-bit code units
  • The code-points defined by the BMP are encoded as single 16-bit code units that are numerically equal to the corresponding code points
  • Code points from the Supplementary Planes are encoded by pairs of 16-bit code units called surrogate pairs: https://en.wikipedia.org/wiki/UTF-16#Code_points_U.2B010000_to_U.2B10FFFF
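The surrogate pair calculation can be sketched in a few lines. This is an illustration rather than a reference implementation; the class and method names are hypothetical, and in real code the JDK's own Character.toChars does the same job.

```java
class SurrogatePairs {
    /** Encodes a code point as one or two UTF-16 code units. */
    static char[] toUtf16(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("Not a code point: " + codePoint);
        }
        if (codePoint < 0x10000) {
            // BMP: the code unit is numerically equal to the code point.
            return new char[] {(char) codePoint};
        }
        int offset = codePoint - 0x10000;               // 20-bit value
        char high = (char) (0xD800 + (offset >> 10));   // top 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return new char[] {high, low};
    }

    public static void main(String[] args) {
        for (char unit : toUtf16(0x1F600)) {
            System.out.printf("%04X ", (int) unit);
        }
        System.out.println();
    }
}
```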
UTF-32 
  • Uses exactly 32 bits per Unicode code point.
  • The UTF-32 form of a character is a direct representation of its codepoint
  • Example: 00 00 00 61 (big-endian) is UTF-32 for Unicode code point U+0061, which is 'a'
Byte Order Mark (BOM)
  • U+FEFF
  • If the endian architecture of the decoder matches that of the encoder, the decoder detects the 0xFEFF value, but an opposite-endian decoder interprets the BOM as the non-character value U+FFFE reserved for this purpose. This incorrect result provides a hint to perform byte-swapping for the remaining values
  • In UTF-16, a BOM (U+FEFF) may be placed as the first character of a file or character stream
  • The UTF-8 representation of the BOM is the byte sequence 0xEF,0xBB,0xBF
  • The Unicode Standard neither requires nor recommends the use of the BOM for UTF-8 
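A decoder can use the BOM bytes listed above to guess the encoding of a byte stream. The sketch below only covers the UTF-8 and UTF-16 cases mentioned here; it deliberately ignores UTF-32 BOMs (so FF FE 00 00 would be reported as UTF-16LE), and the class and method names are invented for the example.

```java
class BomSniffer {
    /** Returns the encoding suggested by a leading BOM, or null if none is found. */
    static String detect(byte[] data) {
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(data, 0xFE, 0xFF)) return "UTF-16BE";
        // Bytes arrive swapped (U+FFFE when read big-endian): little-endian data.
        if (startsWith(data, 0xFF, 0xFE)) return "UTF-16LE";
        return null;
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            // Mask to compare the byte as an unsigned value.
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 0x61};
        System.out.println(detect(sample));
    }
}
```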
HTML 
  • HTML entity: &#229; (decimal) or &#xE5; (hex) (= å)
URL Unicode Encoding
  • UTF-16 (non-standard): %uXXXX, e.g. %u00e9 -> é
  • UTF-8: %XX[%XX][%XX][%XX], e.g. %c2%a9 -> ©, %e2%89%a0 -> ≠
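Both URL forms can be reproduced with a few lines of code. This is only a sketch with made-up names: unlike a real URL encoder it escapes every byte or code unit, including plain ASCII letters, and it assumes that the %uXXXX form emits one escape per UTF-16 code unit.

```java
import java.nio.charset.StandardCharsets;

class PercentEncoding {
    /** Escapes every UTF-8 byte of the text as %XX (lowercase hex). */
    static String percentEncodeUtf8(String text) {
        StringBuilder out = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            out.append(String.format("%%%02x", b & 0xFF));
        }
        return out.toString();
    }

    /** Escapes every UTF-16 code unit of the text as %uXXXX (non-standard form). */
    static String percentEncodeU(String text) {
        StringBuilder out = new StringBuilder();
        for (char unit : text.toCharArray()) {
            out.append(String.format("%%u%04x", (int) unit));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(percentEncodeUtf8("\u00A9")); // copyright sign, U+00A9
        System.out.println(percentEncodeUtf8("\u2260")); // not-equal sign, U+2260
        System.out.println(percentEncodeU("\u00E9"));    // e with acute, U+00E9
    }
}
```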
Compiled from:
  • http://www.darkcoding.net/software/finally-understanding-unicode-and-utf-8/ 
  • http://de.selfhtml.org/inter/unicode.htm
  • https://en.wikipedia.org/wiki/Plane_%28Unicode%29
  • https://en.wikipedia.org/wiki/UTF-8
  • https://en.wikipedia.org/wiki/UTF-16 
  • https://en.wikipedia.org/wiki/UTF-32
  • https://en.wikipedia.org/wiki/Byte_order_mark

Friday 25 July 2014

Video & Audio Containers & Codecs

"You may think of video files as “AVI files” or “MP4 files.” In reality, “AVI” and “MP4” are just container formats. Just like a ZIP file can contain any sort of file within it, video container formats only define how to store things within them, not what kinds of data are stored. (It’s a little more complicated than that, because not all video streams are compatible with all container formats, but never mind that for now.)

A video file usually contains multiple tracks — a video track (without audio), plus one or more audio tracks (without video). Tracks are usually interrelated. An audio track contains markers within it to help synchronize the audio with the video. Individual tracks can have metadata, such as the aspect ratio of a video track, or the language of an audio track. Containers can also have metadata, such as the title of the video itself, cover art for the video, episode numbers (for television shows), and so on." from http://diveintohtml5.info/video.html

MPEG-4 (.mp4, .m4v)
  • Common video codec: H.264
  • Common audio codec: AAC
  • Alfresco registered MIME type: video/mp4 (.mp4), video/x-m4v (.m4v)
  • Developed by ISO.
  • The MPEG-4 container is based on Apple’s older QuickTime container (.mov).
  • Can also be used to store other data such as subtitles and still images.
  • MP4 files can contain metadata as defined by the format standard and, in addition, Extensible Metadata Platform (XMP) metadata.
  • More recent versions of Flash also support the MPEG-4 container.
WebM (.webm)
  • Common video codec: VP8
  • Common audio codec: Vorbis
  • Alfresco registered MIME type: video/webm
  • Audio-video format designed to provide a royalty-free, open video compression format for use with HTML5 video. Development is sponsored by Google.
  • Based on the Matroska media container.
  • Adobe has also announced that a future version of Flash will support WebM video.
Ogg (.ogv)
  • Common video codec: Theora (= Ogg Video)
  • Common audio codec: Vorbis (= Ogg Audio)
  • Alfresco registered MIME type: video/ogg
  • Ogg is an open standard, open source–friendly, and unencumbered by any known patents.
  • Ogg is a free, open container format maintained by the Xiph.Org Foundation.
  • The Ogg container format can multiplex a number of independent streams for audio, video, text (such as subtitles), and metadata.
Flash Video (.flv)
  • Common video codecs: H.264, VP6, Sorenson Spark
  • Common audio codecs: AAC, MP3
  • Alfresco registered MIME type: video/x-flv
  • Developed by Adobe Systems.
  • Prior to Flash 9.0.60.184 (a.k.a. Flash Player 9 Update 3), this was the only container format that Flash supported.
Audio Video Interleave (.avi)
  • Common video codec: MPEG-4 Part 2
  • Common audio codec: MP3
  • Alfresco registered MIME type: video/x-msvideo
  • The AVI container format was invented by Microsoft in a simpler time.
  • It does not even officially support most of the modern video and audio codecs in use today.
Matroska (.mkv)
  • Common video codec: H.264
  • Common audio codec: Vorbis
  • The Matroska Multimedia Container is an open standard free container format, a file format that can hold an unlimited number of video, audio, picture or subtitle tracks in one file.
RealMedia (.rm)
  • Common video codec: RealVideo
  • Common audio codec: RealAudio
  • RealMedia is a proprietary multimedia container format created by RealNetworks. It is used for streaming content over the Internet.
3GP (.3gp)
  • Common video codecs: H.264, ...
  • Common audio codecs: AAC, ...
  • It is used on 3G mobile phones but can also be played on some 2G and 4G phones.
3G2
  • Common video codecs: H.264, ...
  • Common audio codecs: AAC, ...
  • Alfresco registered MIME type: video/x-3gpp2
  • It is very similar to the 3GP file format, but has some extensions and limitations in comparison to 3GP.
QuickTime (.mov, .qt)
  • Common video codec: H.264
  • Common audio codec: AAC
  • Alfresco registered MIME type: video/quicktime
  • Developed by Apple Inc.
  • Multimedia container file that contains one or more tracks, each of which stores a particular type of data: audio, video, effects, or text (e.g. for subtitles).
Advanced Systems Format (.asf, .wmv)
  • Common video codec: Windows Media Video
  • Common audio codec: Windows Media Audio
  • Alfresco registered MIME types: video/x-ms-asf, video/x-ms-wmv
  • Microsoft's proprietary digital audio/digital video container format, especially meant for streaming media.
  • Files containing only WMA audio can be named using a .WMA extension, and files of audio and video content may have the extension .WMV.
  • Both may use the .ASF extension if desired.


Ogg (.oga, .ogg)
  • Common audio codec: Vorbis (= Ogg Audio)
  • Alfresco registered MIME type: audio/ogg
  • Lossy audio compression.
  • The Xiph.Org Foundation recommends that .ogg only be used for Ogg Vorbis audio files.
MP3 (.mp3)
  • Common audio codec: MP3
  • Alfresco registered MIME type: audio/x-mpeg
  • Lossy audio compression.
  • An MP3 file created with a setting of 128 kbit/s is about 1/11 the size of the CD file created from the original audio source.
  • Several bit rates are specified in the MPEG-1 Audio Layer III standard: 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256 and 320 kbit/s, and the available sampling frequencies are 32, 44.1 and 48 kHz.
  • Additional extensions were defined in MPEG-2 Audio Layer III: bit rates 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160 kbit/s and sampling frequencies 16, 22.05 and 24 kHz.
  • A sample rate of 44.1 kHz is almost always used, because this is also used for CD audio, the main source used for creating MP3 files.
  • Most MP3 files today contain ID3 metadata.
  • The MP3 format allows for variable bit rate encoding, which means that some parts of the encoded stream are compressed more than others.
WAV (.wav)
  • Common audio codec: PCM
  • Alfresco registered MIME type: audio/x-wav
  • Developed by Microsoft & IBM.
Advanced Audio Coding (.m4a, .aac)
  • Common audio codec: AAC
  • Alfresco registered MIME type: audio/aac
  • Lossy audio compression.
  • AAC generally achieves better sound quality than MP3 at similar bit rates.
  • AAC is also the default or standard audio format for iPhone, iPod, iPad, Nintendo DSi, iTunes and PlayStation 3.
Matroska (.mka)
  • Common audio codec: Vorbis
Advanced Systems Format (.wma)
  • Common audio codec: MP3
  • Alfresco registered MIME type: audio/x-ms-wma
  • An audio data compression technology developed by Microsoft.
  • The name can be used to refer to its audio file format or its audio codecs.
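The "about 1/11" size figure quoted for 128 kbit/s MP3 can be checked with simple arithmetic. The sketch assumes standard CD audio parameters (44.1 kHz, 16 bits per sample, 2 channels); the 16-bit stereo values are not stated in the text above, and the names are hypothetical.

```java
class Mp3SizeRatio {
    /** Bit rate of uncompressed PCM audio in kbit/s. */
    static double pcmKbitPerSecond(int sampleRateHz, int bitsPerSample, int channels) {
        return sampleRateHz * (double) bitsPerSample * channels / 1000.0;
    }

    public static void main(String[] args) {
        // Assumed CD audio parameters: 44.1 kHz, 16 bit, stereo.
        double cd = pcmKbitPerSecond(44_100, 16, 2);
        double mp3 = 128.0;
        System.out.printf("CD: %.1f kbit/s, CD/MP3 ratio: %.2f%n", cd, cd / mp3);
    }
}
```

Under these assumptions the CD stream runs at roughly 1411 kbit/s, which is about 11 times the 128 kbit/s MP3 rate.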

FFmpeg

 
General syntax: ffmpeg [global options] [[infile options][‘-i’ infile]]... {[outfile options] outfile}...
 

General options

  • -i: Specifies the source file (input file) and lists information about it (metadata, bit rate, encoding, etc.). Example: ffmpeg -i lala.mp3
  • -codecs: Lists all available codecs. Example: ffmpeg -codecs
  • -formats: Lists all available formats. Example: ffmpeg -formats
 

Important audio options

  • -acodec: The audio codec used to encode the output file, e.g. libvorbis, libmp3lame. To keep the codec of the source file, use the special value 'copy', in which case no transcoding takes place: -acodec copy. Example: ffmpeg -i lala.mp3 -acodec libvorbis lala.ogg
  • -ab: The bit rate used to encode the output file. A lower bit rate reduces the file size but also the quality. It makes no sense to set a higher bit rate for the output file than the source file has. Example: ffmpeg -i zzz.mp3 -ab 64k zzz2.mp3
  • -aq: The audio quality, for codecs with variable bit rate.
  • -ar: The sampling frequency in hertz. It makes no sense to set a higher frequency for the output file than the source file has. Example: ffmpeg -i zzz.mp3 -ar 22050 -ab 96k zzz2.mp3
  • -ss: When used as an input option (before -i), seeks in this input file to position. When used as an output option (before an output filename), decodes but discards input until the timestamps reach position. This is slower, but more accurate. Position may be either in seconds or in hh:mm:ss[.xxx] form. Example: ffmpeg -ss 00:00:30.00 -t 25 -i bar.mp3 -acodec copy bar-new.mp3
  • -t: Stop writing the output after its duration reaches duration. Duration may be a number in seconds, or in hh:mm:ss[.xxx] form.
  • -ac: Set the number of audio channels. For output streams it is set by default to the number of input audio channels. Example: ffmpeg -i zzz.mp3 -ac 1 zzz2.mp3
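When ffmpeg is driven from Java rather than from a shell, the audio options above turn into an argument list. The following sketch only assembles such a list (the class and method names are made up); actually starting the process is left as a comment and assumes an ffmpeg binary on the PATH.

```java
import java.util.ArrayList;
import java.util.List;

class FfmpegCommand {
    /** Builds the argument list for a simple audio transcode; it does not run anything. */
    static List<String> transcodeAudio(String input, String codec, String bitrate, String output) {
        List<String> cmd = new ArrayList<>();
        cmd.add("ffmpeg");
        cmd.add("-i");
        cmd.add(input);
        cmd.add("-acodec");
        cmd.add(codec);
        cmd.add("-ab");
        cmd.add(bitrate);
        cmd.add(output);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = transcodeAudio("lala.mp3", "libvorbis", "64k", "lala.ogg");
        System.out.println(String.join(" ", cmd));
        // To run it (assumes ffmpeg is installed and on the PATH):
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Passing the arguments as separate list elements avoids shell quoting problems with file names that contain spaces.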
 

Important video options

  • -b: Bit rate. Example: -b 2000k
  • -vcodec: The video codec. Examples: -vcodec mpeg4, -vcodec copy
  • -s: Frame size. Examples: -s 320x240, -s xga
  • -aspect: Aspect ratio. Example: -aspect 4:3
  • -target: Predefined targets (all the format options such as bit rate, codecs and buffer sizes are then set automatically). Example: -target ntsc-dvd
  • -r: Frame rate. Example: -r 10
  • -f: Container format. Example: -f avi
  • -ss: When used as an input option (before -i), seeks in this input file to position. When used as an output option (before an output filename), decodes but discards input until the timestamps reach position. This is slower, but more accurate. Position may be either in seconds or in hh:mm:ss[.xxx] form. Example (extract an image): extract one frame at second five, over a duration of one second (at a frame rate of one frame per second), at a size of 320x240: -r 1 -t 1 -ss 5 -s 320x240
  • -t: Stop writing the output after its duration reaches duration. Duration may be a number in seconds, or in hh:mm:ss[.xxx] form.

A FFmpeg Tutorial For Beginners
Using ffmpeg to manipulate audio and video files
FFmpeg – the swiss army knife of Internet Streaming

Tuesday 8 July 2014

Remove a single entry from a MyBatis cache programmatically

MyBatis automatically flushes its cache on an insert/update/delete statement. But what if you need to flush an item from the cache because its database representation has been changed by a different application? Here is how to remove a single entry from a MyBatis cache programmatically:


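import org.apache.ibatis.cache.Cache;
import org.apache.ibatis.cache.CacheKey;
import org.apache.ibatis.executor.SimpleExecutor;
import org.apache.ibatis.mapping.BoundSql;
import org.apache.ibatis.mapping.MappedStatement;
import org.apache.ibatis.session.Configuration;
import org.apache.ibatis.session.RowBounds;
import org.apache.ibatis.session.SqlSession;

// These methods belong inside a class; 'logger' is assumed to be an SLF4J Logger field of that class.

// Removes the cached result of one mapped statement call and returns it (null if nothing was cached).
public Object removeCacheEntry(SqlSession sqlSession, String cacheId, String mappingName, Object parameterObject) {
    Object removedObject = null;
    Cache cache = getCache(sqlSession, cacheId);
    if (cache != null) {
        CacheKey cacheKey = getCacheKey(sqlSession, mappingName, parameterObject);
        if (cacheKey != null) {
            removedObject = cache.removeObject(cacheKey);
            logger.info("Remove from cache: {} by key {}", removedObject, cacheKey);
        }
    }
    return removedObject;
}

// Looks up the cache by its id, which is the namespace of the mapper.
private Cache getCache(SqlSession sqlSession, String cacheId) {
    return sqlSession.getConfiguration().getCache(cacheId);
}

// Rebuilds the key MyBatis uses for the given statement and parameter.
private CacheKey getCacheKey(SqlSession sqlSession, String mappingName, Object parameterObject) {
    Configuration configuration = sqlSession.getConfiguration();
    // The executor is only used to compute the key, so it needs no transaction.
    SimpleExecutor executor = new SimpleExecutor(configuration, null);
    MappedStatement mappedStatement = configuration.getMappedStatement(mappingName);
    BoundSql boundSql = mappedStatement.getBoundSql(parameterObject);
    return executor.createCacheKey(mappedStatement, parameterObject, RowBounds.DEFAULT, boundSql);
}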

For example:


removeCacheEntry(sqlSession, "com.test.MyEntityMapper", "com.test.MyEntityMapper.getById", 1);