Thursday, April 12, 2012

Basic git usage


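Features of git

git stores complete file snapshots rather than differences (packfiles may delta-compress objects internally, but that is a storage detail)
git is distributed and does not need a central server
The files in the working directory can be switched at any time between the branches stored in the git directory ".git"
git does not commit the working directory's changes directly; changes must first be added to the Index by hand. A change is in one of three states: staged - already added to the Index; modified - changed but not yet added to the Index (the file was added to the Index before); untracked - a new file that has never been added to the Index.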

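git concepts

blob - a file; stores the file's contents
tree - a directory; points to trees or blobs
commit - points to a tree, marks the state at that moment, and records metadata (author, committer, timestamp, parent commits, and so on)
tag - a label attached to a commit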

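Configuring git


After installing git, the first step is to set the global options, which are stored in the .gitconfig file in your home directory. The usual ones are user.name, user.email and core.autocrlf.

git config --global core.autocrlf false   # disable automatic line-ending conversion
git config --global core.autocrlf input   # or: convert CRLF to LF on commit, leave files untouched on checkout
git config --global user.name "Jeromy Fu"
git config --global user.email "fuji246@gmail.com"

To configure a single project rather than every project, drop --global from the commands above. git can be accessed over several protocols: git, http and ssh. The git protocol is more efficient, but http gets through firewalls.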
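
A minimal sketch of the per-project case, assuming a throwaway repository created with mktemp. The name and address are placeholders, not values from this post.

```shell
# Per-repository settings: without --global, values go to <repo>/.git/config.
repo=$(mktemp -d)
git init -q "$repo"

# Placeholder identity, used only for this illustration.
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.com"

# --local restricts the lookup to the repository's own config file.
local_name=$(git -C "$repo" config --local --get user.name)
local_email=$(git -C "$repo" config --local --get user.email)
echo "$local_name <$local_email>"

rm -rf "$repo"
```

A value set this way takes precedence over the global one inside that repository only.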

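Basic git operations


git init - initialize a repository

git add file1 file2 - add files to the Index (not yet committed)

git diff - show changes in the working directory that have not yet been added to the Index

git diff --cached - show the staged changes that will go into the next commit

git status - list staged, modified and untracked changes

git commit - commit the changes in the Index to the repository

git commit -a - stage every change to tracked files (new files are not included) and commit, doing git add and git commit in one step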

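Each commit needs a message. The best format is a one-line summary, then a blank line, then the detailed description. Many tools rely on this convention and use the first line as the email subject and the detailed description as the email body.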
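
The summary/body split can be tried out with a sketch like the one below. The repository, file name and message text are made up for the illustration; each -m option becomes its own paragraph, so git inserts the blank line itself.

```shell
# Commit with a one-line summary and a separate body, then read them back.
repo=$(mktemp -d)
git init -q "$repo"
echo "hello" > "$repo/readme.txt"
git -C "$repo" add readme.txt

# Placeholder identity passed with -c so no global config is needed.
git -C "$repo" -c user.name="Example User" -c user.email="user@example.com" \
    commit -q -m "Add readme" -m "Explain what the project is for."

subject=$(git -C "$repo" log -1 --format=%s)   # the summary line
body=$(git -C "$repo" log -1 --format=%b)      # everything after the blank line
echo "subject: $subject"
echo "body:    $body"

rm -rf "$repo"
```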

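In some earlier tools, the add command exists only to put new files under version control. git's add can add new files too, but its meaning is different: it adds content to the Index.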
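
A small sketch of how one file moves through the untracked, staged and modified states, using git status --porcelain for compact output. The repository, file name and identity are assumptions made up for the example.

```shell
# Walk a file through untracked -> staged -> committed -> modified -> staged.
repo=$(mktemp -d)
git init -q "$repo"
g() { git -C "$repo" -c user.name="Example User" -c user.email="user@example.com" "$@"; }

echo "v1" > "$repo/notes.txt"
untracked=$(g status --porcelain)     # new file, never added to the Index

g add notes.txt
staged_new=$(g status --porcelain)    # in the Index, waiting for a commit
g commit -q -m "Add notes"

echo "v2" >> "$repo/notes.txt"
modified=$(g status --porcelain)      # tracked file changed, not yet in the Index

g add notes.txt
staged=$(g status --porcelain)        # the change is now recorded in the Index

printf '%s\n' "$untracked" "$staged_new" "$modified" "$staged"
rm -rf "$repo"
```

In the two-letter prefix, the first column describes the Index and the second the working directory, so the four lines read "??", "A ", " M" and "M " followed by the file name.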

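git branch branchname - create a new branch

git branch - list all local branches and mark the current one

git checkout branchname - check out the branch branchname from the git directory (.git) into the working directory

If files were modified but not yet committed before the branch was created, those uncommitted changes are carried over into the working directory when you check out the new branch.

git merge branchname - merge branchname into the current branch

If the merge has conflicts, resolve them and then commit. git diff shows the conflicts.

gitk - show a graphical history of the current branch

git branch -d branchname - delete the branch branchname (you must be on a different branch). If branchname contains work that has not been merged into the current branch (which need not be master), the deletion fails. This ensures the content of branchname has been merged into some other branch before it is removed.

git branch -D branchname - force-delete the branch branchname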
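
The difference between -d and -D can be seen with a sketch like this one. The branch name topic, the empty commits and the identity are placeholders chosen for the illustration.

```shell
# -d refuses to delete a branch with unmerged work; -D deletes it anyway.
repo=$(mktemp -d)
git init -q "$repo"
g() { git -C "$repo" -c user.name="Example User" -c user.email="user@example.com" "$@"; }

g commit -q --allow-empty -m "base"
g checkout -q -b topic
g commit -q --allow-empty -m "work that exists only on topic"
g checkout -q -                       # back to the branch we started on

if g branch -d topic 2>/dev/null; then
    safe_delete="deleted"
else
    safe_delete="refused"             # topic is not merged into the current branch
fi

g branch -D topic >/dev/null          # force the deletion
remaining=$(g branch --list topic)    # empty once the branch is gone

echo "git branch -d: $safe_delete"
rm -rf "$repo"
```
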
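git diff - show changes in the working directory that have not yet been added to the Index (modified).

git diff --cached - show changes that have already been added to the Index (staged).

git diff HEAD - show all changes in the working directory relative to the last commit, excluding untracked files.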
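
The three forms can be compared side by side with a sketch like the following, which stages a change to one file and leaves another unstaged. File names and identity are assumptions for the example; --name-only keeps the output to file names.

```shell
# Compare git diff, git diff --cached and git diff HEAD on the same repository.
repo=$(mktemp -d)
git init -q "$repo"
g() { git -C "$repo" -c user.name="Example User" -c user.email="user@example.com" "$@"; }

echo "one" > "$repo/a.txt"
echo "one" > "$repo/b.txt"
g add a.txt b.txt
g commit -q -m "Add a and b"

echo "two" >> "$repo/a.txt"
echo "two" >> "$repo/b.txt"
echo "new" > "$repo/c.txt"            # untracked: appears in none of the diffs
g add a.txt                           # stage only the change to a.txt

unstaged=$(g diff --name-only)            # working directory vs Index
staged=$(g diff --cached --name-only)     # Index vs last commit
all_changes=$(g diff HEAD --name-only)    # working directory vs last commit

echo "git diff:          $unstaged"
echo "git diff --cached: $staged"
echo "git diff HEAD:    " $all_changes
rm -rf "$repo"
```

Here git diff lists b.txt, git diff --cached lists a.txt, and git diff HEAD lists both.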

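git clone [uri] [dir] - clone a project

git pull [uri] [branch] - fetch a specific branch from uri and merge it automatically

git remote add [name] [uri] - give the remote uri a name

git fetch [name]/[uri] - fetch from the remote without merging automatically.

git branch -r - list the remote branches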



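git reset --hard - if the repository was created with a plain git init, this command is needed after a push arrives before the new content shows up in its working directory. For a repository that receives pushes, git --bare init is the better choice.

git checkout -- file1 // restore the working-directory file from the staging-area snapshot; modifications in the working directory are discarded.
git reset HEAD file1 // unstage file1 by removing its snapshot from the staging area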


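After git clone, only the default branch exists locally and the other branches appear to be missing. To create local tracking branches for them, save the shell script below and run it:

#!/bin/bash
# Create a local tracking branch for every remote branch except HEAD and master.
for branch in $(git branch -a | grep remotes | grep -v HEAD | grep -v master); do
    # ${branch##*/} strips everything up to the last "/" to get the local name.
    git branch --track "${branch##*/}" "$branch"
done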


Reference: http://stackoverflow.com/questions/67699/how-do-i-clone-all-remote-branches-with-git


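Git bare and non-bare repositories:
In short, a bare repository works like a centralized repository and has no working directory (working copy). The default
git init
creates a non-bare repository. If you push to that non-bare repository from elsewhere, the latest code does not appear in its working directory. You have to run
git reset --hard
to bring the working directory up to date. For a repository that serves as the central server, use
git --bare init
to initialize it.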

Converting between the two:

bare -> non-bare: clone the bare repository, then delete the original bare one
non-bare -> bare:
git clone --bare -l non_bare_repo new_bare_repo
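
A sketch of the whole round trip under assumed directory names (central.git, dev, dev.git) inside a temporary directory: create a bare repository, push to it from a clone, then convert the non-bare clone back to a bare one. It uses git init --bare, which is equivalent to git --bare init.

```shell
# Bare repository as the "central" one, plus a non-bare -> bare conversion.
work=$(mktemp -d)
git init -q --bare "$work/central.git"
central_is_bare=$(git -C "$work/central.git" rev-parse --is-bare-repository)

# Cloning an empty repository prints a warning on stderr; silence it.
git clone -q "$work/central.git" "$work/dev" 2>/dev/null
echo "hello" > "$work/dev/file.txt"
git -C "$work/dev" add file.txt
git -C "$work/dev" -c user.name="Example User" -c user.email="user@example.com" \
    commit -q -m "Add file"
git -C "$work/dev" push -q origin HEAD     # a bare repository accepts the push

commits=$(git -C "$work/central.git" rev-list --count --all)
dev_is_bare=$(git -C "$work/dev" rev-parse --is-bare-repository)

# non-bare -> bare, with the command shown above.
git clone -q --bare -l "$work/dev" "$work/dev.git"
converted_is_bare=$(git -C "$work/dev.git" rev-parse --is-bare-repository)

echo "central.git bare: $central_is_bare, commits received: $commits"
echo "dev bare: $dev_is_bare, dev.git bare: $converted_is_bare"
rm -rf "$work"
```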

Reference links:

http://sitaramc.github.com/concepts/bare.html
http://www.bitflop.com/document/111
 

Sunday, April 8, 2012

charset encoding guess

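Uploading a txt file to bigbluebutton produced garbled text. The root cause is that openoffice does not know the encoding of the txt file. I looked around and found that some third-party tools can guess the encoding. For background on encoding problems, see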

http://blogs.msdn.com/b/michkap/archive/2005/01/30/363308.aspx
http://blogs.msdn.com/b/oldnewthing/archive/2007/04/17/2158334.aspx

Open-source projects for guessing encodings
http://docs.codehaus.org/display/GUESSENC/Home (this one does not guess; listed for reference)
http://code.google.com/p/juniversalchardet/
http://jchardet.sourceforge.net/
http://site.icu-project.org/
http://cpdetector.sourceforge.net/
http://tika.apache.org/1.1/api/org/apache/tika/language/LanguageIdentifier.html

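According to its description, juniversalchardet is more accurate than jchardet, so I plan to try it first.

I tried it: it detects the encoding and is fairly easy to use, so I am going to submit a patch to bbb.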
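
As a side note, a cheap first check before running any detector is whether the file is already valid UTF-8. The sketch below does that with iconv; it is not what juniversalchardet does internally, and the file names and the is_utf8 helper are made up for the illustration.

```shell
# Validate UTF-8 by asking iconv to convert UTF-8 to UTF-8; it fails on bad bytes.
is_utf8() { iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1; }

tmp=$(mktemp -d)
printf '\344\270\255\n' > "$tmp/utf8.txt"   # the character U+4E2D encoded as UTF-8
printf '\326\320\n'     > "$tmp/gbk.txt"    # the same character encoded as GBK

if is_utf8 "$tmp/utf8.txt"; then utf8_result="valid"; else utf8_result="invalid"; fi
if is_utf8 "$tmp/gbk.txt";  then gbk_result="valid";  else gbk_result="invalid";  fi

echo "utf8.txt: $utf8_result"
echo "gbk.txt:  $gbk_result"
rm -rf "$tmp"
```

Only files that fail this check need a statistical guess, which is where a detector library comes in.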

Sunday, March 25, 2012

How to talk to the Modem with AT commands [repost]


Asterisk's chan_celliax uses AT commands to talk to the phone.

http://www.asteriskwin32.com/chan_celliax.php

 

http://forum.xda-developers.com/showthread.php?t=1471241



Friday, March 23, 2012

What is Caller ID [repost]

Caller ID
      Your phone rings. A name pops up on your phone's screen. It's the name and number of the person calling you. Actually, it's the originating telephone number and the name the phone company thinks is the subscriber. The originating telephone number is stored in the originating central office equipment register, which is a database. That number supports a further database lookup, which associates the directory listing, assuming that the originating number is listed (i.e., not unlisted, or "nonpub" for nonpublished). The name and number information is passed through the local and long distance networks, and appears on your Caller ID box or your display telephone between the first and second rings.
      The delivery of Caller ID information assumes several things. First, the entire network of switches must be supported by SS7 (Signaling System #7). Second, the calling party must originate the call from a single-channel line, rather than a multichannel trunk (e.g., T-1). Third, the originating line/caller must not block the transmission of the information. If all of these criteria are not met, your Caller ID box will display "ANONYMOUS" or "NOT AVAILABLE." Caller ID is one of several CLASS (Custom Local Area Signaling Services) provided by your LEC (Local Exchange Carrier). There generally is both a small installation charge and a monthly charge for Caller ID. Caller ID lets you amaze your parents and scare your technophobic friends, when you answer the phone with something like "Hi, Harry! Great Dictionary!" Caller ID also helps you avoid those dinnertime calls from telemarketers. They always block their numbers. By the way, Caller ID is not the same as ANI, although they often are confused. See also ANI, Caller ID Message Format (for a very detailed explanation), and CLASS.
Caller ID Message Format
      Calling Number Delivery (CND) came about as an extension of Automatic Number Identification (ANI). ANI is a method that is used by telephone companies to identify the billing account for a toll call. Although ANI is not the service that provides the information for CID, it was the first to offer caller information to authorized parties. The CID service became possible with the implementation of Signaling System 7 (SS7). The CID information is transmitted on the subscriber loop using frequency shift keyed (FSK) modem tones. These FSK modem tones are used to transmit the display message in American Standard Code for Information Interchange (ASCII) character code form. The transmission of the display message takes place between the first and second ring. The information sent includes the date, time, and calling number. The name associated with the calling number is sometimes included also. Since the time CID was first made available, it has been expanded to offer CID on Call Waiting (CIDCW) as well. With CIDCW, the call waiting tone is heard and the identification of the second call is seen. In earlier editions of my dictionary, I included the complete formatting, down to individual bits. It's too technical for this dictionary. However, if you want the entire story in all its detail, go to http://www.testmark.com/develop/tml_callerid_cnt.html and read the article on "Caller ID Basics", by Michael W. Slawson of Intertek Testing Services, TestMark Laboratories. Michael has assured me that he will leave his excellent paper on the Web forever.

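Thursday, March 22, 2012

On collaborative editing

Collaborative editing is quite useful in some situations.

I collected some material. There are several ways to implement it; for details see the "Technical challenges" section of http://en.wikipedia.org/wiki/Collaborative_real-time_editor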

CoEditor
----------

http://vhost1597.developer.ihost.com:8080/cowebx-apps/coedit/index.html
https://github.com/opencoweb

Client-side
Less than 2000 lines of Dojo-powered javascript

Server-side
A python web server capable of Operational Transformation, powered by the Open Cooperative Web Framework
 
 
Gobby
-------------------
 
Gobby is a free collaborative editor supporting multiple documents in one session and a multi-user chat. It runs on Microsoft Windows, Mac OS X, Linux and other Unix-like platforms. 

http://gobby.0x539.de/trac/
 
moonedit
-------------

http://moonedit.com/

Mozilla Skywriter 
---------------------
 
https://github.com/mozilla/skywriter 
 
 
ACE
-------------
 
http://sourceforge.net/projects/ace/
 

EtherPad
---------------

There are two versions, which are effectively two completely different implementations.

http://en.wikipedia.org/wiki/EtherPad
http://etherpad.org/
http://code.google.com/p/etherpad/
https://github.com/ether/pad
https://github.com/pita/etherpad-lite

mobwrite
--------------
http://code.google.com/p/google-mobwrite/
 
 
Theoretical basis of collaborative editing
-------------------------
 
Operational transformation (OT) is a technology for supporting a range of collaboration functionalities in advanced groupware systems.  
 
http://en.wikipedia.org/wiki/Operational_transformation#OT_Control_.28Integration.29_Algorithms
 
 
Finally, the theory of cooperative work
---------------------------
computer-supported cooperative work (CSCW)  

http://en.wikipedia.org/wiki/Computer_Supported_Cooperative_Work

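Thursday, March 15, 2012

Introduction to SCTP

Functionally, SCTP sits between TCP and UDP: it is lighter than TCP but more capable than UDP. It also supports multi-homing, which lets it make good use of multiple local network connections.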

http://tdrwww.exp-math.uni-essen.de/inhalt/forschung/sctp_fb/sctp_intro.html

http://www.bluestop.org/SctpDrv/

Wednesday, March 14, 2012

A description of ATM cells [repost]

Why cells?

Consider a speech signal reduced to packets, and forced to share a link with bursty data traffic (traffic with some large data packets). No matter how small the speech packets could be made, they would always encounter full-size data packets, and under normal queuing conditions, might experience maximum queuing delays. That is why all packets, or "cells," should have the same small size. In addition the fixed cell structure means that ATM can be readily switched by hardware without the inherent delays introduced by software switched and routed frames.
Thus, the designers of ATM utilized small data cells to reduce jitter (delay variance, in this case) in the multiplexing of data streams. Reduction of jitter (and also end-to-end round-trip delays) is particularly important when carrying voice traffic, because the conversion of digitized voice into an analogue audio signal is an inherently real-time process, and to do a good job, the decoder (codec) that does this needs an evenly spaced (in time) stream of data items. If the next data item is not available when it is needed, the codec has no choice but to produce silence or guess — and if the data is late, it is useless, because the time period when it should have been converted to a signal has already passed.
At the time of the design of ATM, 155 Mbit/s Synchronous Digital Hierarchy (SDH) with 135 Mbit/s payload was considered a fast optical network link, and many Plesiochronous Digital Hierarchy (PDH) links in the digital network were considerably slower, ranging from 1.544 to 45 Mbit/s in the USA, and 2 to 34 Mbit/s in Europe.
At this rate, a typical full-length 1500 byte (12000-bit) data packet would take 77.42 µs to transmit. In a lower-speed link, such as a 1.544 Mbit/s T1 line, a 1500 byte packet would take up to 7.8 milliseconds.
A queuing delay induced by several such data packets might exceed the figure of 7.8 ms several times over, in addition to any packet generation delay in the shorter speech packet. This was clearly unacceptable for speech traffic, which needs to have low jitter in the data stream being fed into the codec if it is to produce good-quality sound. A packet voice system can produce this low jitter in a number of ways:
  • Have a playback buffer between the network and the codec, one large enough to tide the codec over almost all the jitter in the data. This allows smoothing out the jitter, but the delay introduced by passage through the buffer would require echo cancellers even in local networks; this was considered too expensive at the time. Also, it would have increased the delay across the channel, and conversation is difficult over high-delay channels.
  • Build a system that can inherently provide low jitter (and minimal overall delay) to traffic that needs it.
  • Operate on a 1:1 user basis (i.e., a dedicated pipe).
The design of ATM aimed for a low-jitter network interface. However, "cells" were introduced into the design to provide short queuing delays while continuing to support datagram traffic. ATM broke up all packets, data, and voice streams into 48-byte chunks, adding a 5-byte routing header to each one so that they could be reassembled later. The choice of 48 bytes was political rather than technical.[4] When the CCITT (now ITU-T) was standardizing ATM, parties from the United States wanted a 64-byte payload because this was felt to be a good compromise in larger payloads optimized for data transmission and shorter payloads optimized for real-time applications like voice; parties from Europe wanted 32-byte payloads because the small size (and therefore short transmission times) simplify voice applications with respect to echo cancellation. Most of the European parties eventually came around to the arguments made by the Americans, but France and a few others held out for a shorter cell length. With 32 bytes, France would have been able to implement an ATM-based voice network with calls from one end of France to the other requiring no echo cancellation. 48 bytes (plus 5 header bytes = 53) was chosen as a compromise between the two sides. 5-byte headers were chosen because it was thought that 10% of the payload was the maximum price to pay for routing information.[3] ATM multiplexed these 53-byte cells instead of packets which reduced worst-case cell contention jitter by a factor of almost 30, reducing the need for echo cancellers.
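
The figures quoted above can be re-derived with a few lines of arithmetic. This is only a sketch: it uses the 155 Mbit/s line rate and the 1.544 Mbit/s T1 rate from the text, ignores framing overhead, and the tx_us helper is a name invented for the example.

```shell
# Serialization delay in microseconds: bits divided by Mbit/s gives µs directly.
tx_us() { awk -v bytes="$1" -v mbps="$2" 'BEGIN { printf "%.2f", bytes * 8 / mbps }'; }

packet_fast=$(tx_us 1500 155)      # 1500-byte packet on a 155 Mbit/s link
packet_t1=$(tx_us 1500 1.544)      # the same packet on a 1.544 Mbit/s T1 line
cell_fast=$(tx_us 53 155)          # one 53-byte ATM cell on the 155 Mbit/s link

ratio=$(awk 'BEGIN { printf "%.1f", 1500 / 53 }')        # packet size vs cell size
overhead=$(awk 'BEGIN { printf "%.1f", 5 / 48 * 100 }')  # header as % of payload

echo "1500-byte packet at 155 Mbit/s:   $packet_fast us"
echo "1500-byte packet at 1.544 Mbit/s: $packet_t1 us"
echo "53-byte cell at 155 Mbit/s:       $cell_fast us"
echo "packet/cell size ratio:           $ratio"
echo "header overhead:                  $overhead %"
```

The results, 77.42 µs and about 7772 µs (7.8 ms) for the full packet, line up with the numbers in the text, and the size ratio of roughly 28 is the "factor of almost 30" by which worst-case contention delay shrinks.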